operations.selection_pipeline

Module with the selection pipeline class. A pipeline chains a selector and an operation together and acts as a bridge between selecting HTML elements and processing their data.

class SelectionPipeline(selector: TagSearcher | TagSearcherMeta, operation: BaseOperation)

Bases: TagSearcher, Comparable

Class for chaining a searcher and an operation together. It uses the searcher to find information in an element and the operation to process the data.

Example

>>> from soupsavvy import TypeSelector
>>> from soupsavvy.operations import Operation, Text
>>> pipeline = TypeSelector("span") | Text()
>>> pipeline.find(soup)
'information'

The most common way to create a pipeline is to apply the | operator to a selector and an operation.
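The chaining idea can be sketched without the library. The classes below (SketchSelector, SketchOperation, SketchPipeline) are hypothetical stand-ins written for illustration, with a plain dict playing the role of the element; they are not soupsavvy's implementation, and only show how an `__or__` overload might turn `selector | operation` into a pipeline object.

```python
# Stand-in classes for illustration only; not soupsavvy's implementation.
class SketchOperation:
    """Hypothetical operation wrapping a plain function."""

    def __init__(self, func):
        self.func = func

    def execute(self, arg):
        return self.func(arg)


class SketchPipeline:
    """Hypothetical pipeline: search first, then process the result."""

    def __init__(self, selector, operation):
        self.selector = selector
        self.operation = operation

    def find(self, data):
        return self.operation.execute(self.selector.find(data))


class SketchSelector:
    """Hypothetical selector that looks up a key in a dict."""

    def __init__(self, key):
        self.key = key

    def find(self, data):
        return data.get(self.key)

    def __or__(self, other):
        # selector | operation builds a pipeline from both parts
        if not isinstance(other, SketchOperation):
            return NotImplemented
        return SketchPipeline(self, other)


pipeline = SketchSelector("span") | SketchOperation(str.upper)
result = pipeline.find({"span": "information"})
```

Returning NotImplemented for unsupported operands lets Python raise its usual TypeError, which keeps the operator overload narrow.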

__init__(selector: TagSearcher | TagSearcherMeta, operation: BaseOperation) → None

Initializes SelectionPipeline with selector and operation.

Parameters

selector : TagSearcher

Selector used for finding target information in the element.

operation : BaseOperation

Operation used for processing the data.

Raises

NotTagSearcherException

If provided selector is not a valid TagSearcher instance.

NotOperationException

If provided operation is not a valid BaseOperation instance.
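As a rough illustration of the validation described above, the sketch below checks both arguments in the constructor. All classes are local stand-ins: the two exceptions only borrow the names listed under Raises, their base class is an assumption, and StubSearcher and StubOperation are hypothetical placeholders for the real base types.

```python
# Local stand-ins; the real exception hierarchy is an assumption here.
class NotTagSearcherException(Exception):
    """Stand-in for the exception raised on an invalid selector."""


class NotOperationException(Exception):
    """Stand-in for the exception raised on an invalid operation."""


class StubSearcher:
    """Hypothetical placeholder for a searcher base class."""


class StubOperation:
    """Hypothetical placeholder for an operation base class."""


class SketchPipeline:
    def __init__(self, selector, operation):
        # Validate eagerly so a misconfigured pipeline fails at construction,
        # not later during a search.
        if not isinstance(selector, StubSearcher):
            raise NotTagSearcherException(
                f"Expected a searcher, got {type(selector).__name__}"
            )
        if not isinstance(operation, StubOperation):
            raise NotOperationException(
                f"Expected an operation, got {type(operation).__name__}"
            )
        self.selector = selector
        self.operation = operation


valid = SketchPipeline(StubSearcher(), StubOperation())
```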

property selector: TagSearcher

Returns TagSearcher object of this pipeline used for finding target information in the element.

Returns

TagSearcher

TagSearcher object used in this pipeline.

property operation: BaseOperation

Returns BaseOperation object of this pipeline used for processing the data.

Returns

BaseOperation

BaseOperation object used in this pipeline.

find(tag: IElement, strict: bool = False, recursive: bool = True) → Any

Finds the first element matching the selector and processes it with the operation.

Parameters

tag : IElement

Any IElement object to process.

strict : bool, optional

If True, raises an exception when no matching element is found. By default False.

recursive : bool, optional

Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.

Returns

Any

Result of the operation applied to the found element.

Raises

TagNotFoundException

If the strict parameter is set to True and no matching element was found.

FailedOperationExecution

If operation execution failed on the found element.
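The interaction between strict and the two exceptions can be sketched in plain Python. Everything below is a stand-in: sketch_find is a hypothetical function working on a list of strings instead of elements, the exception classes only reuse the names listed above, and returning None for a missing element in non-strict mode is an assumption that the text above does not state.

```python
class TagNotFoundException(Exception):
    """Local stand-in for the exception named above."""


class FailedOperationExecution(Exception):
    """Local stand-in for the exception named above."""


def sketch_find(items, matches, operation, strict=False):
    """Apply operation to the first item accepted by matches."""
    found = next((item for item in items if matches(item)), None)
    if found is None:
        if strict:
            raise TagNotFoundException("no matching element found")
        # Assumption: a missing element is passed through as None.
        return None
    try:
        return operation(found)
    except Exception as exc:
        # Wrap any operation error in a single, predictable exception type.
        raise FailedOperationExecution(str(exc)) from exc


def is_two_chars(text):
    return len(text) == 2


items = ["a", "42", "bb"]
first = sketch_find(items, is_two_chars, int)
missing = sketch_find(items, lambda text: len(text) == 5, int)
```

Here only the first match ("42") is processed, so the later "bb", which int cannot parse, never reaches the operation.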

find_all(tag: IElement, recursive: bool = True, limit: int | None = None) → list[Any]

Finds all elements matching the selector and processes each of them with the operation.

Parameters

tag : IElement

Any IElement object to process.

recursive : bool, optional

Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.

limit : int, optional

Maximum number of results to return. By default None, in which case all results are returned.

Returns

list[Any]

A list of results. If no elements were found, the list is empty.

Raises

FailedOperationExecution

If operation execution failed on any of the found elements.
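The effect of recursive and limit can be illustrated on a toy tree. The sketch below is not the library's code: el, iter_nodes and sketch_find_all are hypothetical helpers, elements are modelled as plain dicts, and results are assumed to come back in document order.

```python
from itertools import islice


def el(name, text="", children=()):
    """Build a hypothetical element as a plain dict."""
    return {"name": name, "text": text, "children": list(children)}


def iter_nodes(node, recursive=True):
    """Yield descendants in document order; direct children only if not recursive."""
    for child in node["children"]:
        yield child
        if recursive:
            yield from iter_nodes(child, recursive=True)


def sketch_find_all(node, name, operation, recursive=True, limit=None):
    """Apply operation to every matching node, stopping after limit matches."""
    matches = (n for n in iter_nodes(node, recursive) if n["name"] == name)
    # islice with None as the stop value consumes the whole generator.
    return [operation(n) for n in islice(matches, limit)]


def get_text(node):
    return node["text"]


root = el("div", children=[
    el("span", "one"),
    el("p", children=[el("span", "two")]),
    el("span", "three"),
])

everything = sketch_find_all(root, "span", get_text)
direct_only = sketch_find_all(root, "span", get_text, recursive=False)
first_two = sketch_find_all(root, "span", get_text, limit=2)
nothing = sketch_find_all(root, "a", get_text)
```

Applying the limit before the operation means elements beyond the limit are never processed, so an operation that would fail on them cannot raise.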