base
Module for base classes used across the package. Introduces another layer of abstraction for operations and selectors.
- check_selector(x: Any, message: str | None = None) -> SoupSelector
Checks whether the provided object is a valid soupsavvy selector, i.e. an instance of SoupSelector, and raises an exception if it is not. For convenience, returns the provided object when the check passes.
Parameters
- x: Any
Any object to be validated as a correct selector.
- message: str, optional
Custom message to display when the exception is raised. By default None, which results in the default message.
Raises
- NotSoupSelectorException
If provided object is not an instance of SoupSelector.
- check_operation(x: Any, message: str | None = None) -> BaseOperation
Checks whether the provided object is a valid soupsavvy operation, i.e. an instance of BaseOperation, and raises an exception if it is not. For convenience, returns the provided object when the check passes.
Parameters
- x: Any
Any object to be validated as a correct operation.
- message: str, optional
Custom message to display when the exception is raised. By default None, which results in the default message.
Raises
- NotOperationException
If provided object is not an instance of BaseOperation.
- check_tag_searcher(x: Any, message: str | None = None) -> TagSearcher
Checks whether the provided object is a valid soupsavvy TagSearcher. Accepts an instance of TagSearcher or another compatible type, such as a Model class. For convenience, returns the provided object when the check passes.
Parameters
- x: Any
Any object to be validated as a correct TagSearcher.
- message: str, optional
Custom message to display when the exception is raised. By default None, which results in the default message.
Raises
- NotTagSearcherException
If provided object is not an instance of TagSearcher or any other compatible type.
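The three check_* helpers above share one pattern: validate with isinstance, raise on failure, and return the object on success so it can be used inline. The sketch below illustrates that pattern only. SoupSelector and NotSoupSelectorException here are local stand-ins, and the exception base class, the default message wording, and the Wrapper class are assumptions, not the library's code.

```python
from typing import Any, Optional


class SoupSelector:
    """Stand-in for soupsavvy's SoupSelector (illustration only)."""


class NotSoupSelectorException(TypeError):
    """Stand-in exception; inheriting from TypeError is an assumption."""


def check_selector(x: Any, message: Optional[str] = None) -> SoupSelector:
    """Return x unchanged if it is a SoupSelector, otherwise raise."""
    if not isinstance(x, SoupSelector):
        # The default message wording is an assumption.
        default = f"Expected a SoupSelector instance, got {type(x).__name__}."
        raise NotSoupSelectorException(message or default)
    return x


class Wrapper:
    """Hypothetical consumer that validates and stores in one expression."""

    def __init__(self, selector: Any) -> None:
        # Returning the checked object makes this a one-liner.
        self.selector = check_selector(selector)
```

Returning the validated object is what allows the check to sit on the right-hand side of an assignment, as in Wrapper.__init__.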
- class SoupSelector
Bases:
TagSearcher, Comparable
Base class for all soupsavvy selectors, which define a declarative procedure for finding matching nodes in an HTML element.
Selectors can be combined with other selectors to create search procedures. They can be chained with operations to extract and transform the data.
Methods
find
Finds the first element matching the selector in the provided element. If no element is found, returns None by default, or raises an exception if strict mode is enabled. Set the recursive parameter to False to search only direct children.
find_all
Finds all elements matching selector in provided element and returns them in a list. Additionally limit and recursive parameters can be set.
Notes
- A specific selector inheriting from this class needs to implement:
find_all method that returns a list of matching elements.
__eq__ method to compare two selectors for equality.
Optionally, the find method can be implemented to return the first matching element, but by default it uses find_all under the hood.
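The contract in the notes above can be sketched with an abstract base class whose find delegates to find_all. This is a minimal illustration, not soupsavvy's implementation: Node, descendants, SelectorSketch and NameSelector are hypothetical names, and calling find_all with limit=1 is an assumption about how the default find might work.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List, Optional


class TagNotFoundException(Exception):
    """Stand-in for soupsavvy's exception of the same name."""


class Node:
    """Toy element (hypothetical): a tag name plus child nodes."""

    def __init__(self, name: str, children: Optional[List["Node"]] = None) -> None:
        self.name = name
        self.children = children or []


def descendants(node: Node, recursive: bool) -> Iterator[Node]:
    """Yield children in document order, descending only if recursive."""
    for child in node.children:
        yield child
        if recursive:
            yield from descendants(child, recursive)


class SelectorSketch(ABC):
    """Subclasses must provide find_all and __eq__; find comes for free."""

    @abstractmethod
    def find_all(self, tag: Node, recursive: bool = True,
                 limit: Optional[int] = None) -> List[Node]: ...

    @abstractmethod
    def __eq__(self, other: object) -> bool: ...

    def find(self, tag: Node, strict: bool = False,
             recursive: bool = True) -> Optional[Node]:
        # Assumption: the default find asks find_all for a single result.
        found = self.find_all(tag, recursive=recursive, limit=1)
        if found:
            return found[0]
        if strict:
            raise TagNotFoundException("No element matched the selector.")
        return None


class NameSelector(SelectorSketch):
    """Hypothetical concrete selector matching nodes by tag name."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: object) -> bool:
        return isinstance(other, NameSelector) and self.name == other.name

    def find_all(self, tag: Node, recursive: bool = True,
                 limit: Optional[int] = None) -> List[Node]:
        matches = [n for n in descendants(tag, recursive) if n.name == self.name]
        return matches if limit is None else matches[:limit]
```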
- find(tag: IElement, strict: Literal[False] = False, recursive: bool = True) -> IElement | None
- find(tag: IElement, strict: Literal[True], recursive: bool = True) -> IElement
- find(tag: IElement, strict: bool = False, recursive: bool = True) -> IElement | None
Finds the first matching element in provided IElement.
Parameters
- tag: IElement
Any IElement object to search within.
- strict: bool, optional
If True, raises an exception when no element is found in the markup; if False, returns None in that case. This parameter has no effect when an element is found. By default False.
- recursive: bool, optional
Specifies whether the search should be recursive. If set to False, only direct children of the element are searched. By default True.
Returns
- IElement | None
First IElement object matching the selector, or None if nothing matches.
Raises
- TagNotFoundException
If the strict parameter is set to True and no matching element was found.
- abstract find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[IElement]
Finds all elements matching selector in provided IElement.
Parameters
- tag: IElement
Any IElement object to search within.
- recursive: bool, optional
Specifies whether the search should be recursive. If set to False, only direct children of the element are searched. By default True.
- limit: int, optional
Specifies the maximum number of elements to return. By default None, in which case all found elements are returned.
Returns
- list[IElement]
List of IElement objects matching selector. If none found, the list is empty.
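To make the recursive and limit semantics concrete, here is a small sketch on a toy tree. Node, iter_nodes and find_all_by_name are hypothetical helpers written for this illustration; the lazy traversal with itertools.islice is one possible way to honor limit and is not taken from the library.

```python
from itertools import islice
from typing import Iterator, List, Optional


class Node:
    """Toy element (hypothetical): a tag name plus child nodes."""

    def __init__(self, name: str, children: Optional[List["Node"]] = None) -> None:
        self.name = name
        self.children = children or []


def iter_nodes(root: Node, recursive: bool = True) -> Iterator[Node]:
    """Yield nodes below root in document order.

    With recursive=False only direct children are visited.
    """
    for child in root.children:
        yield child
        if recursive:
            yield from iter_nodes(child, recursive=True)


def find_all_by_name(root: Node, name: str, recursive: bool = True,
                     limit: Optional[int] = None) -> List[Node]:
    """Return matching nodes; an empty list means nothing matched."""
    matches = (node for node in iter_nodes(root, recursive) if node.name == name)
    # islice(..., None) consumes everything; an int stops the walk early.
    return list(islice(matches, limit))
```

Because the matches are produced lazily, a small limit stops the traversal as soon as enough elements are collected.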
- class SelectableCSS
Bases:
ABC
Interface for selectors that can be clearly and unambiguously expressed as a CSS selector which matches the same elements as the find methods.
Notes
To implement the SelectableCSS interface, a child class must implement the abstract css property, which returns a string representing the CSS selector.
- abstract property css: str
Returns string representing element css selector.
- property selector: str
Returns string representing element css selector.
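A minimal sketch of an interface with an abstract css property follows. It assumes that selector simply delegates to css, which the reference above does not confirm; SelectableCSSSketch and ClassNameSelector are hypothetical names.

```python
from abc import ABC, abstractmethod


class SelectableCSSSketch(ABC):
    """Stand-in for the SelectableCSS interface (illustration only)."""

    @property
    @abstractmethod
    def css(self) -> str:
        """CSS selector string; every child class must provide it."""

    @property
    def selector(self) -> str:
        # Assumption: selector is an alias that delegates to css.
        return self.css


class ClassNameSelector(SelectableCSSSketch):
    """Hypothetical selector matching elements by class name."""

    def __init__(self, name: str) -> None:
        self.name = name

    @property
    def css(self) -> str:
        return f".{self.name}"
```

A child class that omits css cannot be instantiated, because the abstract property keeps the class abstract.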
- class CompositeSoupSelector(selectors: Iterable[SoupSelector])
Bases:
SoupSelector
Interface for selectors consisting of multiple selectors.
Notes
To implement the CompositeSoupSelector interface, a child class must call this class's __init__ method with the provided selectors to set up the object.
Attributes
- selectors: list[SoupSelector]
List of SoupSelector objects used for searching elements.
- COMMUTATIVE = True
- __init__(selectors: Iterable[SoupSelector]) -> None
Initializes composite selector object with provided selectors. Checks if all selectors are instances of SoupSelector.
Parameters
- selectors: Iterable[SoupSelector]
Selectors used to search for elements.
Raises
- NotSoupSelectorException
If any of provided parameters is not an instance of SoupSelector.
- property selectors: list[SoupSelector]
Returns a list of selectors that composite selector consists of.
Returns
- list[SoupSelector]
List of SoupSelector objects used for searching elements.
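The initialization contract described above can be sketched as follows. The classes are local stand-ins, and AnyOf with its positional-argument signature is a hypothetical child class used only to show the required call to the parent __init__.

```python
from typing import Any, Iterable, List


class SoupSelector:
    """Stand-in for soupsavvy's SoupSelector (illustration only)."""


class NotSoupSelectorException(TypeError):
    """Stand-in exception; inheriting from TypeError is an assumption."""


class CompositeSketch(SoupSelector):
    """Validates and stores the selectors it is composed of."""

    def __init__(self, selectors: Iterable[Any]) -> None:
        checked: List[SoupSelector] = []
        for selector in selectors:
            if not isinstance(selector, SoupSelector):
                raise NotSoupSelectorException(
                    f"Expected a SoupSelector instance, got {selector!r}."
                )
            checked.append(selector)
        self._selectors = checked

    @property
    def selectors(self) -> List[SoupSelector]:
        return self._selectors


class AnyOf(CompositeSketch):
    """Hypothetical child class taking selectors as positional arguments."""

    def __init__(self, *selectors: Any) -> None:
        # The child must hand its selectors to the parent initializer.
        super().__init__(selectors)
```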
- class BaseOperation
Bases:
Executable, Comparable
Base class for all soupsavvy operations. Operations process the selection results from the soup, extracting and transforming the data.
Operations can be chained together using the pipe operator ‘|’.
Example
>>> from soupsavvy.operations import Operation
>>> operation = Operation(str.lower) | Operation(str.strip)
>>> operation.execute(" TEXT ")
'text'
Operations can be combined with selectors to extract and transform target information.
Example
>>> from soupsavvy import TypeSelector
>>> from soupsavvy.operations import Operation, Text
>>> selector = TypeSelector("div") | Text() | Operation(int)
>>> selector.find(soup)
42
BaseOperation inherits from the Comparable interface, so derived classes must implement the __eq__ method.
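How pipe chaining can be built on top of __or__ is sketched below with stand-in classes. OperationSketch and PipelineSketch are hypothetical names; the sketch shows the general technique of composing steps left to right and does not reproduce the library's internals.

```python
from typing import Any, Callable, List


class OperationSketch:
    """Stand-in operation wrapping a single callable."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def execute(self, arg: Any) -> Any:
        return self.func(arg)

    def __or__(self, other: "OperationSketch") -> "PipelineSketch":
        # a | b builds a pipeline that runs a first, then b.
        return PipelineSketch([self, other])

    def __eq__(self, other: object) -> bool:
        return type(other) is OperationSketch and self.func == other.func


class PipelineSketch(OperationSketch):
    """Runs its steps in order, feeding each result into the next step."""

    def __init__(self, steps: List[OperationSketch]) -> None:
        self.steps = list(steps)

    def execute(self, arg: Any) -> Any:
        for step in self.steps:
            arg = step.execute(arg)
        return arg

    def __or__(self, other: OperationSketch) -> "PipelineSketch":
        # Extending an existing pipeline keeps the step list flat.
        return PipelineSketch([*self.steps, other])

    def __eq__(self, other: object) -> bool:
        return isinstance(other, PipelineSketch) and self.steps == other.steps
```

With these stand-ins, `OperationSketch(str.lower) | OperationSketch(str.strip)` applied to `" TEXT "` lowers the text first and strips it second, mirroring the example above.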
- execute(arg: Any) -> Any
Execute the operation on the given argument and return the result.
Parameters
- arg: Any
Argument to be processed by the operation.
Returns
- Any
Result of the operation.
Raises
- BreakOperationException
If operation execution should be interrupted and propagated to caller.
- FailedOperationExecution
If operation execution fails for any other reason.
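The two exceptions listed above suggest a wrapper around the actual work: one exception type passes through untouched, everything else is converted. The sketch below shows that pattern with stand-in exception classes; the message wording and the use of exception chaining are assumptions.

```python
from typing import Any, Callable


class BreakOperationException(Exception):
    """Stand-in: signals that execution should stop and reach the caller."""


class FailedOperationExecution(Exception):
    """Stand-in: wraps any other failure raised during execution."""


class OperationSketch:
    """Hypothetical operation that wraps a callable with error handling."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def execute(self, arg: Any) -> Any:
        try:
            return self.func(arg)
        except BreakOperationException:
            # Propagated unchanged so the caller can react to the break.
            raise
        except Exception as error:
            # Any other failure is wrapped; the original is kept as __cause__.
            raise FailedOperationExecution(
                f"Operation failed for input {arg!r}."
            ) from error
```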
- class OperationSearcherMixin
Bases:
BaseOperation, TagSearcher
Mixin of the BaseOperation and TagSearcher interfaces. Allows operations to be used as field searchers in a model, performing the operation directly on the scope element.
- find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[Any]
Processes provided element and returns the result in a list.
Parameters
- tag: IElement
Any IElement object to process.
- recursive: bool, optional
Ignored, for consistency with interface.
- limit: int, optional
Ignored, for consistency with interface.
Returns
- list[Any]
Result of applied operation on element in a list.
- find(tag: IElement, strict: bool = False, recursive: bool = True) -> Any
Processes provided element and returns the result.
Parameters
- tag: IElement
Any IElement object to process.
- strict: bool, optional
Ignored, for consistency with interface.
- recursive: bool, optional
Ignored, for consistency with interface.
Returns
- Any
Result of applied operation on element.
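The behavior described for find and find_all can be sketched as a mixin that forwards both methods to execute. Element, OperationSearcherSketch and TextLength are hypothetical names used only for this illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Element:
    """Toy element (hypothetical) that carries only its text."""

    text: str


class OperationSearcherSketch(ABC):
    """Stand-in mixin exposing an operation through the searcher interface."""

    @abstractmethod
    def execute(self, arg: Any) -> Any: ...

    def find_all(self, tag: Element, recursive: bool = True,
                 limit: Optional[int] = None) -> List[Any]:
        # recursive and limit exist for interface compatibility and are ignored.
        return [self.execute(tag)]

    def find(self, tag: Element, strict: bool = False,
             recursive: bool = True) -> Any:
        # strict and recursive exist for interface compatibility and are ignored.
        return self.execute(tag)


class TextLength(OperationSearcherSketch):
    """Hypothetical operation returning the length of the element's text."""

    def execute(self, arg: Element) -> int:
        return len(arg.text)
```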
- class BrowserOperation
Bases:
BaseOperation
Base class for operations that act on an IBrowser interface.
Browser operations perform actions on objects implementing the IBrowser interface. They validate that the argument passed to the execute method is of this type. If the operation returns a value, it is passed through; otherwise the original IBrowser instance is returned.
Like standard operations, browser operations can be chained together using the pipe operator ‘|’.
Operations can be combined with selectors to extract and transform target information. Chaining other types of operations might result in errors.
Each derived operation class needs to implement the __eq__ method.
- execute(arg: IBrowser) -> Any
Execute the operation on the given argument and return the result.
Parameters
- arg: IBrowser
Argument to be processed by the operation.
Returns
- Any
Result of the operation.
Raises
- BreakOperationException
If operation execution should be interrupted and propagated to caller.
- FailedOperationExecution
If operation execution fails for any other reason.
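The validate-then-pass-through behavior of browser operations might look like the sketch below. All names are stand-ins: the _execute hook, the TypeError raised on invalid input, and treating None as "no value returned" are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class IBrowser:
    """Stand-in for the IBrowser interface (illustration only)."""

    def __init__(self) -> None:
        self.visited: List[str] = []
        self.title = "Example"


class BrowserOperationSketch(ABC):
    """Validates the input type and falls back to returning the browser."""

    def execute(self, arg: Any) -> Any:
        if not isinstance(arg, IBrowser):
            # Assumption: the actual exception type is not stated above.
            raise TypeError(f"Expected IBrowser, got {type(arg).__name__}.")
        result = self._execute(arg)
        # Assumption: None is treated as "operation returned no value".
        return arg if result is None else result

    @abstractmethod
    def _execute(self, browser: IBrowser) -> Any:
        """Hypothetical hook implemented by concrete operations."""


class Navigate(BrowserOperationSketch):
    """Hypothetical action with no return value."""

    def __init__(self, url: str) -> None:
        self.url = url

    def _execute(self, browser: IBrowser) -> None:
        browser.visited.append(self.url)


class GetTitle(BrowserOperationSketch):
    """Hypothetical operation that returns a value."""

    def _execute(self, browser: IBrowser) -> str:
        return browser.title
```

Returning the browser when there is no result is what lets several browser actions follow one another in a chain, each receiving the same browser instance.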
- class ElementAction
Bases:
Comparable
Abstract base class for actions that use a browser to interact with an element.
Actions operate on an element in a dynamic context, so they require both the browser context (IBrowser) and the target element active in that context (IElement).
ElementActions are not typical operations, so they cannot be chained with other soupsavvy operations. They are intended to be used within the ApplyTo operation.
Example
>>> from soupsavvy.browser.operations import ApplyTo, Click
>>> from soupsavvy import TypeSelector
>>> from soupsavvy.implementation.selenium import SeleniumBrowser
>>> from selenium import webdriver
>>> browser = SeleniumBrowser(webdriver.Chrome())
>>> action = Click()
>>> selector = TypeSelector('button')
>>> operation = ApplyTo(selector, action)
>>> operation.execute(browser)
ElementAction inherits from the Comparable interface, so derived classes must implement the __eq__ method.