selectors
Subpackages
Submodules
- selectors.attributes
- selectors.combinators
- selectors.general
- selectors.logical
- selectors.relative
Content
Subpackage with soup selectors, which define a declarative procedure for searching for elements in the document.
- class TypeSelector(name: str)
Bases: SoupSelector, SelectableCSS
Selector for finding elements based on tag name (type). Counterpart of CSS type selectors.
Example
>>> TypeSelector("div")
matches all elements that have “div” tag name.
Example
>>> <div class="widget">Hello World</div> ✔️
>>> <a href="/shop">Hello World</a> ❌
CSS counterpart can be represented as:
Example
>>> div
And can be retrieved with css property.
Example
>>> TypeSelector("div").css
"div"
Parameters
- name : str
Tag name of the element, e.g. “a”, “div”.
Notes
For more information about type selectors, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/Type_selectors
- name: str
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
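The interplay of the recursive and limit parameters can be illustrated with a standard-library sketch. It uses xml.etree.ElementTree elements in place of IElement, and the helper find_all_by_tag is hypothetical: it only mimics the documented behavior and is not soupsavvy's implementation. Excluding the searched element itself from the candidates is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


def find_all_by_tag(
    root: ET.Element, name: str, recursive: bool = True, limit: Optional[int] = None
) -> List[ET.Element]:
    """Hypothetical stand-in mimicking the documented find_all parameters."""
    # recursive=True walks all descendants; recursive=False only direct children.
    candidates = root.iter() if recursive else iter(root)
    # Assumption: the searched element itself is not a candidate.
    found = [el for el in candidates if el is not root and el.tag == name]
    # limit=None returns everything; otherwise keep the first `limit` matches.
    return found if limit is None else found[:limit]


root = ET.fromstring("<div><a id='1'/><span><a id='2'/></span><a id='3'/></div>")

all_links = [el.get("id") for el in find_all_by_tag(root, "a")]
direct_links = [el.get("id") for el in find_all_by_tag(root, "a", recursive=False)]
first_link = [el.get("id") for el in find_all_by_tag(root, "a", limit=1)]
```

Here the nested link is found only by the recursive search, and limit=1 keeps the first match in document order.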
- property css: str
Returns string representing element css selector.
- __init__(name: str) -> None
- class AttributeSelector(name: str, value: Pattern[str] | str | None = None)
Bases: SoupSelector
Selector for searching for elements based on attribute value. Counterpart of CSS attribute selectors that extends their capability with regex pattern matching.
Example
>>> AttributeSelector(name="role", value="widget")
matches all elements that have ‘role’ attribute with value “widget”.
Example
>>> <div role="widget">Hello World</div> ✔️
>>> <div class="menu">Hello World</div> ❌
>>> <div role="menu">Hello World</div> ❌
CSS counterpart can be represented as:
Example
>>> [role="widget"]
In case of using regex pattern, re.search is used to match the attribute value.
Example
>>> AttributeSelector(name="href", value=re.compile(r"wikipedia"))
Parameters
- name : str
HTML element attribute name, e.g. “class”, “href”.
- value : str | Pattern, optional
Value of the attribute to match. By default None, in which case a default pattern matching any sequence of characters is used.
Notes
For more information about attribute selectors, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors
- name: str
- value: Pattern[str] | str | None = None
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
- __init__(name: str, value: Pattern[str] | str | None = None) -> None
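The three ways AttributeSelector treats its value (omitted, plain string, compiled pattern) can be sketched without the library. The helper attribute_matches is hypothetical, and treating a plain string as an equality check on a single-valued attribute is an assumption of this sketch rather than a statement about soupsavvy internals.

```python
import re
from typing import Optional, Pattern, Union


def attribute_matches(
    actual: Optional[str], value: Union[Pattern[str], str, None] = None
) -> bool:
    """Hypothetical helper mirroring the documented matching rules."""
    if actual is None:
        return False  # the element does not have the attribute at all
    if value is None:
        return True  # no value given: any attribute value is accepted
    if isinstance(value, str):
        return actual == value  # plain string: equality check (assumed)
    return value.search(actual) is not None  # compiled pattern: re.search


wiki = re.compile(r"wikipedia")
results = [
    attribute_matches("widget", "widget"),
    attribute_matches("menu", "widget"),
    attribute_matches(None, "widget"),
    attribute_matches("https://en.wikipedia.org/wiki/CSS", wiki),
    attribute_matches("/shop", wiki),
    attribute_matches("anything"),
]
```

Because re.search is used for patterns, the pattern may match anywhere inside the attribute value, not only at its start.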
- class ClassSelector(value: Pattern[str] | str | None = None)
Bases: SpecificAttributeSelector
Specific AttributeSelector for matching elements based on ‘class’ attribute value.
Example
>>> ClassSelector("widget")
matches all elements that have ‘class’ attribute with value “widget”.
Example
>>> <div class="widget">Hello World</div> ✔️
>>> <div class="content">Hello World</div> ❌
ClassSelector is a convenience wrapper for AttributeSelector, thus example above is equivalent to using:
>>> AttributeSelector(name="class", value="widget")
CSS counterpart can be represented as:
Example
>>> .widget
In case of using regex pattern, re.search is used to match the attribute value.
Example
>>> ClassSelector(re.compile(r"nav"))
Notes
For more information about class selectors, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/Class_selectors
- class IdSelector(value: Pattern[str] | str | None = None)
Bases: SpecificAttributeSelector
Specific AttributeSelector for matching elements based on ‘id’ attribute value.
Example
>>> IdSelector("main")
matches all elements that have ‘id’ attribute with value “main”.
Example
>>> <div id="main">Hello World</div> ✔️
>>> <div id="content">Hello World</div> ❌
IdSelector is a convenience wrapper for AttributeSelector, thus example above is equivalent to using:
>>> AttributeSelector(name="id", value="main")
CSS counterpart can be represented as:
Example
>>> #main
In case of using regex pattern, re.search is used to match the attribute value.
Example
>>> IdSelector(re.compile(r"content[0-9]+"))
Notes
For more information about ID selectors, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/ID_selectors
- class PatternSelector(pattern: Pattern[str] | str)
Bases: SoupSelector
Selector for finding elements based on text content pattern.
Example
>>> PatternSelector("Hello World")
matches all elements with exact text content “Hello World”.
Example
>>> <div>Hello World</div> ✔️
>>> <div>Hello Python</div> ❌
>>> <div>Hello World 3</div> ❌
In case of using regex pattern, re.search is used to match the text content.
Example
>>> PatternSelector(re.compile(r"[0-9]+"))
matches all elements with text content containing at least one digit.
Example
>>> <div>Hello World 123</div> ✔️
>>> <div>Hello World</div> ❌
Parameters
- pattern : str | Pattern
Pattern to match the text of the element. Can be a string for an exact match or a Pattern for more complex regular expressions.
Notes
Element does not match the pattern if it has any children. Only leaf nodes can be returned by PatternSelector find methods.
- pattern: Pattern[str] | str
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
- __init__(pattern: Pattern[str] | str) -> None
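The leaf-only rule from the PatternSelector notes, together with the string versus Pattern distinction, can be sketched with the standard library. The function find_by_text is a hypothetical stand-in built on xml.etree.ElementTree and is not how soupsavvy is implemented.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Pattern, Union


def find_by_text(root: ET.Element, pattern: Union[Pattern[str], str]) -> List[ET.Element]:
    """Hypothetical stand-in: match leaf elements by their text content."""
    matched = []
    for el in root.iter():
        if len(el) > 0:
            continue  # elements with children never match
        text = el.text or ""
        if isinstance(pattern, str):
            ok = text == pattern  # string: exact match
        else:
            ok = pattern.search(text) is not None  # Pattern: re.search
        if ok:
            matched.append(el)
    return matched


root = ET.fromstring(
    "<body>"
    "<div id='a'>Hello World</div>"
    "<div id='b'>Hello World 3</div>"
    "<div id='c'>Hello World<span>x</span></div>"
    "</body>"
)
exact_ids = [el.get("id") for el in find_by_text(root, "Hello World")]
digit_ids = [el.get("id") for el in find_by_text(root, re.compile(r"[0-9]+"))]
```

The element with id 'c' is skipped even though its own text is “Hello World”, because it has a child element.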
- class XPathSelector(xpath: Any)
Bases: SoupSelector
Selector for finding elements based on XPath expressions.
Examples
>>> selector = XPathSelector("//p[@class='menu']")
>>> selector.find(soup)
Examples
>>> from lxml.etree import XPath
>>> selector = XPathSelector(XPath("//p[@class='menu']", smart_strings=False))
>>> selector.find(soup)
Expressions must target elements, not attributes or text content.
Examples
>>> selector = XPathSelector("//div//@href")
>>> selector.find(soup)
None
Notes
Equality check includes only xpath expression, as lxml XPath object does not implement more specific __eq__ method.
- __init__(xpath: Any) -> None
Initializes XPathSelector with a given XPath expression.
Parameters
- xpath : str | lxml.etree.XPath
String representation of an XPath expression or a compiled XPath object. It needs to target elements, not attributes or text content.
Raises
- InvalidXPathSelector
If the provided XPath string cannot be compiled into an XPath object.
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
- class UniversalSelector
Bases: SoupSelector, SelectableCSS
Selector representing a wildcard pattern that matches all elements in the HTML page.
Example
>>> UniversalSelector()
CSS counterpart can be represented as:
Example
>>> *
And can be retrieved with css property.
Example
>>> UniversalSelector().css
"*"
Notes
For more information on universal selector, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/Universal_selectors
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
- property css: str
Returns wildcard css selector matching all elements in the markup.
- __init__() -> None
- class ExpressionSelector(f: Callable[[IElement], bool])
Bases: SoupSelector
Selector that matches elements based on a user-defined function (predicate), which is used as a filter for element objects.
Applies predicate to each element and returns those that satisfy the condition.
Parameters
- f : Callable[[IElement], bool]
A user-defined function (predicate) that determines whether the element should be selected.
Examples
>>> selector = ExpressionSelector(lambda x: x.name not in {"a", "div"})
>>> selector.find(soup)
To perform operations on the underlying node, use the IElement.get() method or the IElement.node attribute.
Example
>>> selector = ExpressionSelector(lambda x: 'widget' in x.node['class'])
This example applies to a SoupElement object, which wraps bs4.Tag.
Notes
Any exceptions should be handled inside the provided function. If an exception is raised, it is propagated to the caller.
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
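Predicate-based filtering, as described for ExpressionSelector, amounts to keeping the elements for which the function returns True. The sketch below uses xml.etree.ElementTree elements rather than IElement, and filter_elements is a hypothetical helper, so attribute access differs from the soupsavvy examples (el.tag instead of x.name).

```python
import xml.etree.ElementTree as ET
from typing import Callable, List


def filter_elements(root: ET.Element, f: Callable[[ET.Element], bool]) -> List[ET.Element]:
    """Hypothetical stand-in: keep the descendants for which f returns True."""
    return [el for el in root.iter() if el is not root and f(el)]


root = ET.fromstring(
    "<body><a/><div class='widget'/><span/><p class='widget big'/></body>"
)

not_links = [el.tag for el in filter_elements(root, lambda x: x.tag not in {"a", "div"})]

# Using .get() with a default keeps error handling inside the predicate:
# elements without a class attribute yield "" instead of raising.
widgets = [
    el.tag
    for el in filter_elements(root, lambda x: "widget" in x.get("class", "").split())
]
```

A predicate that indexes a missing attribute directly would raise instead, and per the notes above that exception would reach the caller.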
- class SelfSelector
Bases: SoupSelector
Selector matching only the element itself. Convenience component that can be used for compatibility.
Example
>>> SelfSelector()
always matches the tag that is passed to the find methods.
Notes
Can be used as the scope in a user-defined model when the element itself is the scope.
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
- class DescendantCombinator(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector)
Bases: BaseCombinator
Counterpart of CSS descendant combinator. Represents the relationship between selectors, where every next matching element is a descendant of the previous one.
Example
>>> DescendantCombinator(TypeSelector("div"), ClassSelector("widget"))
matches all elements with ‘widget’ class that are descendants of a ‘div’ element.
Example
>>> <div><a class="widget"></a></div> ✔️
>>> <div><div><a class="widget"></a></div></div> ✔️
>>> <div><a id="widget"></a></div> ❌
>>> <span><a class="widget"></a></span> ❌
>>> <a class="widget"></a> ❌
Object can be created as well by using right shift operator >> on SoupSelector objects.
Example
>>> TypeSelector("div") >> ClassSelector("widget")
CSS counterpart can be represented as:
Example
>>> div .widget
Notes
For more information on descendant combinator, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator
- class ChildCombinator(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector)
Bases: BaseCombinator
Counterpart of CSS child combinator. Represents the relationship between selectors, where every next matching element is a direct child of the previous one.
Example
>>> ChildCombinator(TypeSelector("div"), TypeSelector("a"))
matches all ‘a’ elements that are direct children of ‘div’ elements.
Example
>>> <div class="widget"><a>Hello World</a></div> ✔️
>>> <div class="widget"><span></span><a>Hello World</a></div> ✔️
>>> <span class="widget"><a>Hello World</a></span> ❌
>>> <div class="menu"><span>Hello World</span></div> ❌
Object can be created as well by using greater than operator > on SoupSelector objects.
Example
>>> TypeSelector("div") > TypeSelector("a")
Which is equivalent to the first example.
CSS counterpart can be represented as:
Example
>>> div > a { color: red; }
Notes
For more information on child combinator, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/Child_combinator
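The difference between the descendant and child relationships can be checked with a short standard-library sketch. It walks an xml.etree.ElementTree tree by hand and does not use soupsavvy; the markup and ids are made up for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<body>"
    "<div><a id='child'/><span><a id='nested'/></span></div>"
    "<span><a id='outside'/></span>"
    "</body>"
)

# Descendant relationship (CSS: div a): an 'a' at any depth below a div.
descendants = [a.get("id") for div in root.iter("div") for a in div.iter("a")]

# Child relationship (CSS: div > a): an 'a' directly below a div only.
children = [a.get("id") for div in root.iter("div") for a in div if a.tag == "a"]
```

The link with id 'nested' satisfies only the descendant relationship, and the link with id 'outside' satisfies neither because it has no div above it.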
- class NextSiblingCombinator(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector)
Bases: BaseCombinator
Counterpart of CSS next sibling combinator. Represents the relationship between selectors, where every next matching element is a sibling immediately following the previous one.
Example
>>> NextSiblingCombinator(TypeSelector("div"), TypeSelector("a"))
matches all ‘a’ elements that immediately follow ‘div’ elements, meaning that both elements are children of the same parent element and no other element sits between them.
Example
>>> <div class="widget"></div><a>Hello World</a> ✔️
>>> <div class="widget"><a>Hello World</a></div> ❌
>>> <div class="widget"></div><span></span><a>Hello World</a> ❌
Object can be created as well by using plus operator + on SoupSelector objects.
Example
>>> TypeSelector("div") + TypeSelector("a")
Which is equivalent to the first example.
CSS counterpart can be represented as:
Example
>>> div + a
Notes
This is also known as the adjacent sibling combinator in CSS. For more information on next sibling combinator, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/Next-sibling_combinator
- class SubsequentSiblingCombinator(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector)
Bases: BaseCombinator
Counterpart of CSS subsequent sibling combinator. Represents the relationship between selectors, where every next matching element is a sibling following the previous one, but not necessarily immediately.
Example
>>> SubsequentSiblingCombinator(TypeSelector("div"), TypeSelector("a"))
matches all ‘a’ elements that follow ‘div’ elements.
Example
>>> <div class="widget"></div><a>Hello World</a> ✔️
>>> <div class="widget"></div><span></span><a>Hello World</a> ✔️
>>> <span class="widget"><a>Hello World</a></span> ❌
>>> <a>Hello World</a><div class="menu"></div> ❌
Object can be created as well by using multiplication operator * on SoupSelector objects.
Example
>>> TypeSelector("div") * TypeSelector("a")
CSS counterpart can be represented as:
Example
>>> div ~ a
Notes
This combinator is also known as general sibling combinator in CSS. For more information on subsequent sibling combinator, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/Subsequent-sibling_combinator
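The two sibling relationships differ only in how far past the first element the search looks. The sketch below illustrates this on an xml.etree.ElementTree tree; the helper following_siblings and the markup are hypothetical and the code is independent of soupsavvy.

```python
import xml.etree.ElementTree as ET
from typing import List


def following_siblings(parent: ET.Element, index: int) -> List[ET.Element]:
    """Children of `parent` located after position `index`."""
    return list(parent)[index + 1:]


root = ET.fromstring("<body><div/><a id='adjacent'/><span/><a id='later'/></body>")

next_sibling: List[str] = []
subsequent: List[str] = []
for parent in root.iter():
    for i, child in enumerate(parent):
        if child.tag != "div":
            continue
        later = following_siblings(parent, i)
        # Next sibling (CSS: div + a): only the element right after the div.
        next_sibling += [el.get("id") for el in later[:1] if el.tag == "a"]
        # Subsequent sibling (CSS: div ~ a): every later sibling that is an 'a'.
        subsequent += [el.get("id") for el in later if el.tag == "a"]
```

The second link is separated from the div by a span, so it satisfies only the subsequent sibling relationship.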
- class SelectorList(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector)
Bases: CompositeSoupSelector
Counterpart of CSS selector list. At least one selector from the list must match the element for it to be included.
Example
>>> SelectorList(TypeSelector("a"), AttributeSelector(name="class", value="widget"))
matches all elements that have “a” tag name OR ‘class’ attribute “widget”.
Example
>>> <a>Hello World</a> ✔️
>>> <div class="widget">Hello World</div> ✔️
>>> <div>Hello Python</div> ❌
Object can be created as well by using pipe operator | on SoupSelector objects.
Example
>>> TypeSelector("a") | ClassSelector("widget")
CSS counterpart can be represented as:
Example
>>> a, .widget
>>> :is(a, .widget)
Notes
For more information on selector list, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/Selector_list
- __init__(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector) -> None
Initializes SelectorList object with provided positional arguments.
Parameters
- selectors: SoupSelector
At least two SoupSelector objects to match accepted as positional arguments.
Raises
- NotSoupSelectorException
If any of provided parameters is not an instance of SoupSelector.
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
- class ParentCombinator(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector)
Bases: BaseAncestorCombinator
Defines a relationship between selectors, where every next matching element is a parent of the previous one.
Example
>>> ParentCombinator(TypeSelector("a"), TypeSelector("div"))
The given selector matches all ‘div’ elements that are parents of ‘a’ elements.
Example
>>> <div><a href="/shop"></a></div> ✔️
>>> <div><span><a href="/shop"></a></span></div> ❌
>>> <span><a href="/shop"></a></span> ❌
Object can be created as well by using less than operator < on SoupSelector objects.
Example
>>> TypeSelector("a") < TypeSelector("div")
Although this combinator does not have its counterpart in CSS, it can be represented as:
Example
>>> div:has(> a)
- class AncestorCombinator(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector)
Bases: BaseAncestorCombinator
Defines a relationship between selectors, where every next matching element is an ancestor of the previous one.
Example
>>> AncestorCombinator(TypeSelector("a"), TypeSelector("div"))
The given selector matches all ‘div’ elements that are ancestors of ‘a’ elements.
Example
>>> <div><span><a href="/shop"></a></span></div> ✔️
>>> <div><a href="/shop"></a></div> ✔️
>>> <div><span class="menu"></span></div> ❌
>>> <span><a class="menu"></a></span> ❌
Object can be created as well by using left shift operator << on SoupSelector objects.
Example
>>> TypeSelector("a") << TypeSelector("div")
Although this combinator does not have its counterpart in CSS, it can be represented as:
Example
>>> div:has(a)
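Parent and ancestor relationships select the upper element rather than the lower one. The sketch below reproduces both on an xml.etree.ElementTree tree; since ElementTree has no parent pointers, it builds a child-to-parent map first. The markup and ids are made up, and the code does not use soupsavvy.

```python
import xml.etree.ElementTree as ET
from typing import List

root = ET.fromstring(
    "<body>"
    "<div id='direct'><a/></div>"
    "<div id='indirect'><span><a/></span></div>"
    "<div id='empty'><span/></div>"
    "</body>"
)

# ElementTree has no parent pointers, so build a child -> parent map first.
parent_of = {child: parent for parent in root.iter() for child in parent}

parents: List[str] = []
ancestors: List[str] = []
for a in root.iter("a"):
    node = parent_of.get(a)
    # Parent relationship (div:has(> a)): only the immediate parent counts.
    if node is not None and node.tag == "div":
        parents.append(node.get("id"))
    # Ancestor relationship (div:has(a)): walk all the way up to the root.
    while node is not None:
        if node.tag == "div":
            ancestors.append(node.get("id"))
        node = parent_of.get(node)
```

The div with id 'indirect' qualifies only as an ancestor, because a span sits between it and the link.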
- class AndSelector(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector)
Bases: CompositeSoupSelector
Selector representing an intersection of multiple selectors, where element must be matched by all provided selectors. Counterpart of CSS compound selector.
Example
>>> AndSelector(TypeSelector("div"), ClassSelector("widget"))
matches all elements that have “div” tag name AND ‘class’ attribute “widget”.
Example
>>> <div class="widget">Hello World</div> ✔️
>>> <span class="widget">Hello World</span> ❌
>>> <div class="menu">Hello World</div> ❌
Object can be created as well by using bitwise AND operator & on SoupSelector objects.
Example
>>> TypeSelector("div") & ClassSelector("widget")
CSS counterpart can be represented as:
Example
>>> div.widget
Notes
For more information on compound selectors, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_selectors/Selector_structure#compound_selector
- __init__(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector) -> None
Initializes AndSelector object with provided positional arguments as selectors.
Parameters
- selectors: SoupSelector
At least two SoupSelector objects to match accepted as positional arguments.
Raises
- NotSoupSelectorException
If any of provided parameters is not an instance of SoupSelector.
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
- class NotSelector(selector: SoupSelector, /, *selectors: SoupSelector)
Bases: CompositeSoupSelector
Selector for finding elements that do not match provided selector(s). Counterpart of CSS :not() pseudo-class.
Example
>>> NotSelector(TypeSelector("div"))
matches all elements that do not have “div” tag name.
Example
>>> <span class="widget">Hello World</span> ✔️
>>> <div class="menu">Hello World</div> ❌
Object can be created as well by using bitwise NOT operator ~ on SoupSelector object.
Example
>>> ~TypeSelector("div")
Which is equivalent to the first example.
The CSS counterpart is the :not() pseudo-class.
Example
>>> :not(div)
Notes
For more information on :not() pseudo-class, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/:not
- __init__(selector: SoupSelector, /, *selectors: SoupSelector) -> None
Initializes NotSelector object with provided positional arguments as selectors.
Parameters
- selectors: SoupSelector
At least one SoupSelector object, accepted as positional arguments, whose match is negated.
Raises
- NotSoupSelectorException
If any of provided parameters is not an instance of SoupSelector.
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
- OrSelector
alias of SelectorList
- class HasSelector(selector: SoupSelector, /, *selectors: SoupSelector)
Bases: CompositeSoupSelector
Selector for finding elements based on matching reference elements.
Example
>>> HasSelector(TypeSelector("div"))
matches all elements that have any descendant with “div” tag name. It uses default combinator of relative selector, which is descendant combinator.
Example
>>> <span><div>Hello World</div></span> ✔️
>>> <span><a>Hello World</a></span> ❌
Other relative selectors can be used with Anchor element.
Example
>>> HasSelector(Anchor > TypeSelector("div"))
>>> HasSelector(Anchor + TypeSelector("div"))
or by using RelativeSelector components directly:
Example
>>> HasSelector(RelativeChild(TypeSelector("div")))
>>> HasSelector(RelativeNextSibling(TypeSelector("div")))
Example
>>> <span><div>Hello World</div></span> ✔️
>>> <span><a><div>Hello World</div></a></span> ❌
In the child case, HasSelector is anchored against any element and matches only elements that have a “div” element as a direct child.
This is an equivalent of CSS :has() pseudo-class, that matches element if any of the relative selectors that are passed as an argument match element when anchored against it.
Example
>>> :has(div, a)
>>> :has(+ div, > a)
These examples translated to soupsavvy would be:
Example
>>> HasSelector(TypeSelector("div"), TypeSelector("a"))
>>> HasSelector(Anchor + TypeSelector("div"), Anchor > TypeSelector("a"))
Notes
Passing a RelativeDescendant selector into HasSelector is equivalent to passing its selector directly, as the descendant combinator is the default option.
Example
>>> HasSelector(RelativeDescendant(TypeSelector("div")))
>>> HasSelector(Anchor >> TypeSelector("div"))
>>> HasSelector(TypeSelector("div"))
All three examples above are equivalent.
For more information on :has() pseudo-class, see:
https://developer.mozilla.org/en-US/docs/Web/CSS/:has
- __init__(selector: SoupSelector, /, *selectors: SoupSelector) -> None
Initializes HasSelector object with provided positional arguments as selectors.
Parameters
- selectors: SoupSelector
SoupSelector objects to match accepted as positional arguments. At least one selector is required to create HasSelector.
Raises
- NotSoupSelectorException
If any of provided parameters is not an instance of SoupSelector.
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
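The default descendant relation of HasSelector and the child relation obtained through Anchor can be contrasted in a standard-library sketch. It works on xml.etree.ElementTree elements, restricts the anchors to span elements for brevity, and is not based on soupsavvy's code; the ids are made up.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<body>"
    "<span id='child'><div/></span>"
    "<span id='nested'><a><div/></a></span>"
    "<span id='none'><a/></span>"
    "</body>"
)

spans = root.findall("span")

# Default (descendant) relation, CSS :has(div): a div anywhere below the anchor.
has_descendant = [
    s.get("id") for s in spans if any(el is not s for el in s.iter("div"))
]

# Child relation, CSS :has(> div): a div directly below the anchor.
has_child = [s.get("id") for s in spans if any(el.tag == "div" for el in s)]
```

The span with id 'nested' contains a div only through an intermediate link, so it passes the descendant check but not the child check.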
- class XORSelector(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector)
Bases: CompositeSoupSelector
Selector representing an exclusive OR of multiple selectors, where element must be matched by exactly one of them.
Example
>>> XORSelector(TypeSelector("div"), ClassSelector("widget"))
Matches all elements that have either “div” tag name or ‘class’ attribute “widget”. Elements with both “div” tag name and ‘class’ attribute “widget” do not match selector.
Object can be created as well by using xor operator ^ (caret) on SoupSelector objects.
Example
>>> TypeSelector("div") ^ ClassSelector("widget")
This is a shortcut for defining XOR operation between two selectors like this:
Example
>>> selector1 = TypeSelector("div")
>>> selector2 = ClassSelector("widget")
>>> xor = (selector1 & (~selector2)) | ((~selector1) & selector2)
- __init__(selector1: SoupSelector, selector2: SoupSelector, /, *selectors: SoupSelector) -> None
Initializes XORSelector object with provided positional arguments as selectors.
Parameters
- selectors: SoupSelector
At least two SoupSelector objects to match accepted as positional arguments.
Raises
- NotSoupSelectorException
If any of provided parameters is not an instance of SoupSelector.
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
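The "exactly one" rule of XORSelector, and its agreement with the two-selector expansion built from &, | and ~, can be checked with plain predicates. In this sketch elements are modelled as small dictionaries, and is_div, is_widget, xor and expanded are hypothetical names that stand in for the selectors; nothing here calls soupsavvy.

```python
from typing import Callable, Dict

Element = Dict[str, str]  # minimal stand-in: {"tag": ..., "class": ...}
Predicate = Callable[[Element], bool]


def is_div(el: Element) -> bool:
    return el["tag"] == "div"


def is_widget(el: Element) -> bool:
    return el.get("class") == "widget"


def xor(*predicates: Predicate) -> Predicate:
    """Element passes when exactly one predicate accepts it."""
    return lambda el: sum(bool(p(el)) for p in predicates) == 1


def expanded(el: Element) -> bool:
    """(s1 & ~s2) | (~s1 & s2), the two-selector expansion."""
    return (is_div(el) and not is_widget(el)) or (not is_div(el) and is_widget(el))


elements = [
    {"tag": "div", "class": "widget"},   # both match
    {"tag": "div", "class": "menu"},     # only the tag matches
    {"tag": "span", "class": "widget"},  # only the class matches
    {"tag": "span", "class": "menu"},    # neither matches
]
xor_results = [xor(is_div, is_widget)(el) for el in elements]
expanded_results = [expanded(el) for el in elements]
```

Both formulations reject the element matched by both predicates and the element matched by neither.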
- class AnyTagSelector(**kwargs)
Bases: UniversalSelector
Alias for UniversalSelector class. Deprecated component.
- class NthOfSelector(selector: SoupSelector, nth: str)
Bases: BaseNthOfSelector
Selector for finding nth-of elements in the soup among elements that match provided SoupSelector instance.
Example
>>> selector = NthOfSelector(ClassSelector("item"), "2n+1")
matches all odd elements with class “item”.
Example
>>> <div class="item">1</div> ✔️
... <div id="item"></div> ❌
... <div class="item">2</div> ❌
... <div class="item">3</div> ✔️
... <div class="widget"></div> ❌
... <div class="item">4</div> ❌
Notes
For more information about standard :nth-of-type pseudo-class, visit:
https://developer.mozilla.org/en-US/docs/Web/CSS/:nth-of-type
- class NthLastOfSelector(selector: SoupSelector, nth: str)
Bases: BaseNthOfSelector
Selector for finding nth-last-of elements in the soup among elements that match provided SoupSelector instance.
Example
>>> selector = NthLastOfSelector(ClassSelector("item"), "2n+1")
matches all odd elements with class “item” starting from the last element.
Example
>>> <div class="item">1</div> ❌
... <div id="item"></div> ❌
... <div class="item">2</div> ✔️
... <div class="item">3</div> ❌
... <div class="widget"></div> ❌
... <div class="item">4</div> ✔️
Notes
For more information about standard :nth-last-of-type pseudo-class, visit:
https://developer.mozilla.org/en-US/docs/Web/CSS/:nth-last-of-type
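The arithmetic behind an expression such as "2n+1" can be worked through in a few lines. The sketch assumes the expression has already been parsed into integers a and b (parsing is omitted), and matches_nth is a hypothetical helper rather than part of soupsavvy. Positions are 1-based and count only the elements that match the inner selector, as in the examples above.

```python
def matches_nth(a: int, b: int, position: int) -> bool:
    """True if position == a*n + b for some integer n >= 0 (positions start at 1)."""
    if a == 0:
        return position == b
    n, remainder = divmod(position - b, a)
    return remainder == 0 and n >= 0


# Positions of the four class="item" elements; other siblings are not counted.
positions = [1, 2, 3, 4]

# "2n+1" counted from the first matching element.
nth_of = [p for p in positions if matches_nth(2, 1, p)]

# "2n+1" counted from the last matching element:
# position p from the start is (len - p + 1) from the end.
nth_last_of = [p for p in positions if matches_nth(2, 1, len(positions) - p + 1)]
```

With four matching elements, counting from the start selects items 1 and 3, while counting from the end selects items 2 and 4, which agrees with the two examples.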
- class OnlyOfSelector(selector: SoupSelector)
Bases: SoupSelector
Selector for finding the only element that matches provided SoupSelector instance among its siblings.
Example
>>> selector = OnlyOfSelector(ClassSelector("item"))
matches all elements with class “item” that are the only child of their parent that matches the selector.
Example
>>> <div><div class="item"></div><a class="item"></a></div> ❌
>>> <div><div class="item"></div><a class="widget"></a></div> ✔️
>>> <div><div class="item"></div></div> ✔️
>>> <div><div class="widget"></div></div> ❌
Notes
For more information about standard :only-of-type pseudo-class, visit:
https://developer.mozilla.org/en-US/docs/Web/CSS/:only-of-type
- __init__(selector: SoupSelector) -> None
Initializes OnlyOfSelector instance.
Parameters
- selector : SoupSelector
Any SoupSelector instance used to match elements.
Raises
- NotSoupSelectorException
If selector is not an instance of SoupSelector.
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag : IElement
Any IElement object to search within.
- recursive : bool, optional
Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.
- limit : int, optional
Specifies maximum number of elements to return. By default None, all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
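The "only one among its siblings" rule of OnlyOfSelector can be sketched by grouping matching children per parent. The code below uses xml.etree.ElementTree, the predicate is_item is a hypothetical stand-in for ClassSelector("item"), and the ids on the parent elements exist only to make the result readable.

```python
import xml.etree.ElementTree as ET
from typing import List


def is_item(el: ET.Element) -> bool:
    """Stand-in for ClassSelector("item")."""
    return "item" in el.get("class", "").split()


root = ET.fromstring(
    "<body>"
    "<div id='two'><div class='item'/><a class='item'/></div>"
    "<div id='one'><div class='item'/><a class='widget'/></div>"
    "<div id='alone'><div class='item'/></div>"
    "<div id='zero'><div class='widget'/></div>"
    "</body>"
)

# Ids of the parents whose single matching child would be selected.
only_of: List[str] = []
for parent in root.iter():
    matching = [child for child in parent if is_item(child)]
    # Selected only when no sibling matches the same selector.
    if len(matching) == 1:
        only_of.append(parent.get("id"))
```

Siblings that do not match the inner selector, such as the 'widget' link under the parent with id 'one', do not prevent a match.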