models

Submodules

Content

Package with models for defining search schemas with operations and selectors.

Classes

  • BaseModel - Base class for defining search schemas.

  • Field - Field configuration for search schema.

  • MigrationSchema - Configuration of model migration.

  • All - Wrapper to find all information matching criteria.

  • Default - Wrapper to set default value if information is not found.

  • Required - Wrapper to enforce that information must be found.

class BaseModel(**kwargs)

Bases: TagSearcher, Comparable, JSONSerializable

Base class for all user-defined models in soupsavvy.

__init__(**kwargs) -> None

Initializes a model instance with the provided field values. Models should not be instantiated directly; create them through the find methods.

Parameters

kwargs : Any

Field values to initialize the model with provided as keyword arguments.

Raises

MissingFieldsException

If any required fields are missing from the provided kwargs.

UnknownModelFieldException

If any unknown fields are provided in the kwargs.

property attributes: dict[str, Any]

Returns a dictionary of model instance attributes representing model fields and their respective values.

Returns

dict[str, Any]

A dictionary mapping model field names to their respective values.
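
The validation described for __init__ and the attributes property can be pictured with a small stand-alone sketch. It does not use soupsavvy: SketchModel, its fields tuple, and both exception classes are hypothetical stand-ins, and the order of the two checks is an assumption.

```python
from typing import Any


class MissingFieldsSketchError(Exception):
    """Local stand-in for soupsavvy's MissingFieldsException."""


class UnknownFieldSketchError(Exception):
    """Local stand-in for soupsavvy's UnknownModelFieldException."""


class SketchModel:
    # Hypothetical field registry, hard-coded here for illustration only.
    fields = ("title", "price")

    def __init__(self, **kwargs: Any) -> None:
        # Reject initialization when a declared field has no value.
        missing = [name for name in self.fields if name not in kwargs]
        if missing:
            raise MissingFieldsSketchError(f"missing fields: {missing}")
        # Reject keyword arguments that do not correspond to a declared field.
        unknown = [name for name in kwargs if name not in self.fields]
        if unknown:
            raise UnknownFieldSketchError(f"unknown fields: {unknown}")
        for name, value in kwargs.items():
            setattr(self, name, value)

    @property
    def attributes(self) -> dict[str, Any]:
        # Map every declared field name to its current value.
        return {name: getattr(self, name) for name in self.fields}
```

In this sketch, SketchModel(title="Book", price=10).attributes yields a dictionary with both fields, while omitting price or passing an extra keyword raises the corresponding stand-in exception.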

classmethod find(tag: IElement, strict: Literal[True] = False, recursive: bool = True) -> Self
classmethod find(tag: IElement, strict: Literal[False] = False, recursive: bool = True) -> Self | None
classmethod find(tag: IElement, strict: bool = False, recursive: bool = True) -> Self | None

Searches for and returns an instance of the model within the provided element. By default, performs a recursive, non-strict search for model fields within the scope element.

Parameters

tag : IElement

Any IElement object to search within for the model.

strict : bool, optional

If True, raises an exception when the model scope is not found in the element. Default is False.

recursive : bool, optional

Whether the search for the model scope element should be recursive. Default is True.

Returns

Self | None

An instance of the model if found, otherwise None.

Raises

ModelNotFoundException

If the model’s scope is not found and strict is True.

FieldExtractionException

If any model field failed to be extracted.

classmethod find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[Self]

Searches for and returns all instances of the model within the provided element. By default, performs a recursive, non-strict search for model fields, just like the find method.

Parameters

tag : IElement

Any IElement object to search within for the model.

recursive : bool, optional

Whether the search for the model scope element should be recursive. Default is True.

limit : int, optional

Maximum number of model instances to return. Default is None, which returns all instances found.

Returns

list[Self]

A list of model instances found within the element.
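
The return contracts of find and find_all can be summarized with a stand-alone sketch that searches a plain list instead of an IElement. The functions and the ScopeNotFoundSketchError class below are hypothetical stand-ins that only mirror the documented behavior of strict and limit.

```python
from typing import Callable, Optional


class ScopeNotFoundSketchError(Exception):
    """Local stand-in for soupsavvy's ModelNotFoundException."""


def find(items: list[str], match: Callable[[str], bool], strict: bool = False) -> Optional[str]:
    """Return the first matching item; None if absent, or raise when strict."""
    for item in items:
        if match(item):
            return item
    if strict:
        raise ScopeNotFoundSketchError("no matching scope element")
    return None


def find_all(items: list[str], match: Callable[[str], bool], limit: Optional[int] = None) -> list[str]:
    """Return every matching item, truncated to `limit` when it is given."""
    matches = [item for item in items if match(item)]
    return matches if limit is None else matches[:limit]
```

A non-strict find that matches nothing returns None, the strict variant raises instead, and find_all returns an empty list rather than None when nothing matches.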

migrate(model: Type[T], mapping: dict[Type[BaseModel], Type | MigrationSchema] | None = None, **kwargs) -> T

Migrates the model instance to another model class by passing its fields to the target class initializer. Nested models are migrated recursively into new instances, even when their target model is not defined in the mapping.

Parameters

model : Type[Model]

The target model class to migrate the instance to.

mapping : dict[Type[BaseModel], Union[Type, MigrationSchema]], optional

Mapping of model classes used as fields to their target models. By default, a field that is an instance of BaseModel is passed directly to the target model.

kwargs : Any

Additional keyword arguments to pass to model initialization.

Migrating to the same model is equivalent to creating a deep copy of the model, which can also be achieved by calling the copy method.

Returns

Model

An instance of the target model class.
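
The idea of recursive migration can be illustrated with plain dataclasses. This is a rough sketch under stated assumptions, not the library's implementation: the migrate function, the four classes, and the simplified mapping (class to class, without MigrationSchema or extra keyword arguments) are all hypothetical.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Optional


@dataclass
class Author:
    name: str


@dataclass
class Book:
    title: str
    author: Author


@dataclass
class AuthorRow:
    name: str


@dataclass
class BookRow:
    title: str
    author: Any


def migrate(instance: Any, target: type, mapping: Optional[dict[type, type]] = None) -> Any:
    """Build `target` from the fields of `instance`, rebuilding nested dataclasses."""
    mapping = mapping or {}
    values = {}
    for f in fields(instance):
        value = getattr(instance, f.name)
        if is_dataclass(value):
            # Nested models are rebuilt with the mapped class when one is given,
            # otherwise with their own class, which produces a fresh copy.
            nested_target = mapping.get(type(value), type(value))
            value = migrate(value, nested_target, mapping)
        values[f.name] = value
    return target(**values)
```

Calling migrate(book, BookRow, {Author: AuthorRow}) produces a BookRow whose author is an AuthorRow, while migrate(book, Book) behaves like a deep copy: the result compares equal to the original but shares no nested instances with it.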

copy() -> Self

Creates a deep copy of the model instance by migrating it to the same model class. Only the model fields listed in attributes are used to create the new instance.

Returns

Self

A deep copy of the model instance.

json() -> dict

Converts the model instance to a JSON-serializable dictionary.

Returns

dict

A JSON-serializable representation of the model instance.

class All(selector: TagSearcher | TagSearcherMeta)

Bases: FieldWrapper

Field wrapper for selecting multiple elements matching the selector. Forces the find method to fall back to find_all and return all matches.

Example

>>> from soupsavvy.models import All
>>> from soupsavvy import TypeSelector
>>> selector = All(TypeSelector("div"))
>>> selector.find(tag)
[element1, element2, element3]

find(tag: IElement, strict: bool = False, recursive: bool = True) -> list[Any]

Finds all matching tags using the wrapped selector, enforcing the use of the find_all method.

Parameters

tag : IElement

Any IElement to search within.

strict : bool, optional

Ignored, as this method always falls back to find_all.

recursive : bool, optional

Whether to search recursively, by default True.

Returns

list[Any]

A list of matching results.
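
The delegation that All performs can be sketched without soupsavvy. WordSelector and AllSketch below are hypothetical stand-ins operating on a list of words; they only mirror the documented behavior that find forwards to find_all and ignores strict.

```python
from typing import Any, Optional


class WordSelector:
    """Hypothetical selector over a list of words; not part of soupsavvy."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix

    def find_all(self, words: list[str], recursive: bool = True, limit: Optional[int] = None) -> list[str]:
        matches = [word for word in words if word.startswith(self.prefix)]
        return matches if limit is None else matches[:limit]


class AllSketch:
    """Stand-in wrapper whose find always returns every match."""

    def __init__(self, selector: WordSelector) -> None:
        self.selector = selector

    def find(self, words: list[str], strict: bool = False, recursive: bool = True) -> list[Any]:
        # `strict` is accepted for interface compatibility but deliberately unused.
        return self.selector.find_all(words, recursive=recursive)
```

Because the result is always a list, a search without matches yields an empty list even when strict=True is passed.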

class Default(selector: TagSearcher | TagSearcherMeta, default: Any)

Bases: FieldWrapper

Field wrapper for returning a default value if no match is found.

Example

>>> from soupsavvy.models import Default
>>> from soupsavvy import TypeSelector
>>> selector = Default(TypeSelector("div"), default="1234")
>>> selector.find(tag)
'1234'

__init__(selector: TagSearcher | TagSearcherMeta, default: Any) -> None

Initializes Default field wrapper.

Parameters

selector : TagSearcher

Object compatible with TagSearcher interface to be wrapped.

default : Any

The default value to return if no match is found.

find(tag: IElement, strict: bool = False, recursive: bool = True)

Finds an element, returning the default value if the wrapped selector returns None. Any exception raised during the search is propagated.

Parameters

tag : IElement

Any IElement to search within.

strict : bool, optional

If True, raises an exception if no matches are found, by default False.

recursive : bool, optional

Whether to search recursively, by default True.

Returns

Any

The found element or the default value if not found.
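
The fallback rule of Default can be sketched in a few lines. KeySelector and DefaultSketch are hypothetical stand-ins that search a dict rather than an IElement; the use of LookupError for the strict case is an assumption made only for this illustration.

```python
from typing import Any, Optional


class KeySelector:
    """Hypothetical selector that looks up a key in a dict; not part of soupsavvy."""

    def __init__(self, key: str) -> None:
        self.key = key

    def find(self, tag: dict, strict: bool = False, recursive: bool = True) -> Optional[Any]:
        if self.key not in tag and strict:
            raise LookupError(f"{self.key!r} not found")
        return tag.get(self.key)


class DefaultSketch:
    """Stand-in wrapper that substitutes a default for a None result."""

    def __init__(self, selector: KeySelector, default: Any) -> None:
        self.selector = selector
        self.default = default

    def find(self, tag: dict, strict: bool = False, recursive: bool = True) -> Any:
        # Exceptions from the wrapped selector are not caught, so they propagate.
        result = self.selector.find(tag, strict=strict, recursive=recursive)
        return self.default if result is None else result
```

Note that in this sketch the default only replaces a None result; with strict=True the wrapped selector raises before the default is ever considered.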

class Required(selector: TagSearcher | TagSearcherMeta)

Bases: FieldWrapper

Field wrapper that enforces that the matched element is not None. Raises an exception if the searcher does not find any matches.

Example

>>> from soupsavvy.models import Required
>>> from soupsavvy import TypeSelector
>>> selector = Required(TypeSelector("div"))
>>> selector.find(tag)
RequiredConstraintException

find(tag: IElement, strict: bool = False, recursive: bool = True) -> Any

Finds a required element using the wrapped selector, enforcing that the matched element is not None. Any exception raised during the search is propagated to the caller.

Parameters

tag : IElement

Any IElement to search within.

strict : bool, optional

If True, raises an exception if no matches are found, by default False.

recursive : bool, optional

Whether to search recursively, by default True.

Returns

Any

The found element.

Raises

RequiredConstraintException

If selector returns None, indicating that required element was not found.
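
The constraint enforced by Required can be sketched as the mirror image of a default: a None result becomes an error. The classes below are hypothetical stand-ins working on a dict, and RequiredSketchError only stands in for RequiredConstraintException.

```python
from typing import Any, Optional


class RequiredSketchError(Exception):
    """Local stand-in for soupsavvy's RequiredConstraintException."""


class KeySelector:
    """Hypothetical selector that looks up a key in a dict; not part of soupsavvy."""

    def __init__(self, key: str) -> None:
        self.key = key

    def find(self, tag: dict, strict: bool = False, recursive: bool = True) -> Optional[Any]:
        return tag.get(self.key)


class RequiredSketch:
    """Stand-in wrapper that refuses to return None."""

    def __init__(self, selector: KeySelector) -> None:
        self.selector = selector

    def find(self, tag: dict, strict: bool = False, recursive: bool = True) -> Any:
        result = self.selector.find(tag, strict=strict, recursive=recursive)
        if result is None:
            # The constraint applies even when the search itself is non-strict.
            raise RequiredSketchError(f"required value {self.selector.key!r} not found")
        return result
```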

post(field: str) -> Callable[[Callable], Callable]

Decorator to mark a method as a post-processor for a model field. The method is called after the field is extracted from the element, during model instance initialization.

Example

>>> class MyModel(BaseModel):
...    ...
...    field = ...
...
...    @post("field")
...    def post_process_field(self, value):
...        return value.strip()

Methods of a custom model class decorated with @post must accept exactly one argument besides self: the value of the field to be processed.
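
One way such a decorator could work is to tag the method with the field name and look the tag up during initialization. The sketch below is an assumption about the mechanism, not soupsavvy's implementation: post_sketch, the _post_field marker attribute, and SketchModel are all hypothetical.

```python
from typing import Any, Callable


def post_sketch(field: str) -> Callable[[Callable], Callable]:
    """Mark the decorated method as the post-processor of `field`."""

    def decorator(func: Callable) -> Callable:
        func._post_field = field  # hypothetical marker attribute
        return func

    return decorator


class SketchModel:
    def __init__(self, **kwargs: Any) -> None:
        # Collect the marked methods defined on the class, keyed by field name.
        processors = {
            func._post_field: func
            for func in vars(type(self)).values()
            if hasattr(func, "_post_field")
        }
        for name, value in kwargs.items():
            if name in processors:
                # The processor receives only the extracted value (besides self).
                value = processors[name](self, value)
            setattr(self, name, value)

    @post_sketch("title")
    def clean_title(self, value: str) -> str:
        return value.strip()
```

With this sketch, SketchModel(title="  Dune ", price=10) stores the stripped title and leaves price untouched, since no processor is registered for it.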

serializer(field: str) -> Callable[[Callable], Callable]

Decorator to mark a method as a serializer for a model field. The method is called by the json method to serialize the field value to JSON format.

Example

>>> class MyModel(BaseModel):
...    ...
...    field = ...
...
...    @serializer("field")
...    def serialize_field(self, value):
...        return json.dumps(value)

Methods of a custom model class decorated with @serializer must accept exactly one argument besides self: the value of the field to be serialized.
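
A field serializer is most useful when a value is not JSON-serializable on its own, such as a date. The sketch below shows the idea with hypothetical names (serializer_sketch, the _serializes marker, Event); it assumes a marker-based lookup inside json and is not taken from the library.

```python
import json
from datetime import date
from typing import Callable


def serializer_sketch(field: str) -> Callable[[Callable], Callable]:
    """Mark the decorated method as the serializer of `field`."""

    def decorator(func: Callable) -> Callable:
        func._serializes = field  # hypothetical marker attribute
        return func

    return decorator


class Event:
    def __init__(self, name: str, day: date) -> None:
        self.name = name
        self.day = day

    @serializer_sketch("day")
    def serialize_day(self, value: date) -> str:
        return value.isoformat()

    def json(self) -> dict:
        # Collect the marked methods defined on the class, keyed by field name.
        serializers = {
            func._serializes: func
            for func in vars(type(self)).values()
            if hasattr(func, "_serializes")
        }
        # Fields without a serializer are passed through unchanged.
        return {
            name: serializers[name](self, value) if name in serializers else value
            for name, value in vars(self).items()
        }


payload = json.dumps(Event("launch", date(2024, 1, 2)).json())
```

Without the serializer, json.dumps would raise a TypeError on the date value; with it, the day is emitted as an ISO 8601 string.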

class Field(selector: TagSearcher | TagSearcherMeta, repr: bool = True, compare: bool = True, migrate: bool = True)

Bases: TagSearcher, Comparable

Model field wrapper that defines field metadata. Used to override the default behavior of the attribute corresponding to the field in a model instance, similar to the dataclass field function.

Attributes

selector : TagSearcher | type[BaseModel]

Any searcher used in model as field.

repr : bool, optional

Whether the field should be included in the model’s representation. Default is True.

compare : bool, optional

Whether the field should be included in the model’s equality comparison as well as in the hash calculation. Default is True.

migrate : bool, optional

Whether the field should be migrated to the target model in model migration. Default is True.

Example

>>> class MyModel(BaseModel):
...    __scope__ = TypeSelector("p")
...
...    price = Text() | Operation(int)
...    element = Field(SelfSelector(), repr=False, compare=False, migrate=False)

In this example, only price is relevant to the model object. The element itself is kept only for reference and is excluded from the model’s representation, comparison, and migration.

Using Field wrapper without any additional arguments is equivalent to default behavior.
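
For readers unfamiliar with the dataclass field function that Field is compared to, the standard-library analogue of the repr and compare flags looks like this. The Item class is illustrative only; migrate has no dataclass counterpart and is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    price: int
    # Kept for reference only: hidden from repr() and ignored by ==.
    element: str = field(default="", repr=False, compare=False)


first = Item(10, "<p>10</p>")
second = Item(10, "<p class='sale'>10</p>")

print(repr(first))      # Item(price=10)
print(first == second)  # True
```

The two items compare equal even though their element values differ, because only price takes part in the comparison.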

selector: TagSearcher | TagSearcherMeta
repr: bool = True
compare: bool = True
migrate: bool = True

find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[Any]

Processes an IElement object and returns a list of results.

Parameters

tag : IElement

Any IElement object to process.

recursive : bool, optional

Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.

limit : int, optional

Specifies maximum number of results to return in a list. By default None, everything is returned.

Returns

list[Any]

A list of results from processed element.

find(tag: IElement, strict: bool = False, recursive: bool = True) -> Any

Processes an IElement object and returns the result.

Parameters

tag : IElement

Any IElement object to process.

strict : bool, optional

If True, enforces results to be found in the element, by default False.

recursive : bool, optional

Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.

Returns

Any

Processed result from the element.

__init__(selector: TagSearcher | TagSearcherMeta, repr: bool = True, compare: bool = True, migrate: bool = True) -> None

class MigrationSchema(target: Type, params: dict = <factory>)

Bases: object

Defines a schema for model migration, for cases where additional parameters must be passed to the target model initialization of nested models.

Attributes

target : Type

Target model class to migrate to.

params : dict, optional

Additional parameters to pass to the target model initialization.

target: Type
params: dict

__init__(target: Type, params: dict = <factory>) -> None
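
The `<factory>` default in the signature above is how a dataclass default_factory is typically rendered, which suggests that params defaults to a fresh empty dict per instance. The sketch below shows that pattern with a hypothetical SchemaSketch class; it is an assumption about the shape of MigrationSchema, not a copy of it.

```python
from dataclasses import dataclass, field
from typing import Type


@dataclass
class SchemaSketch:
    target: Type
    # default_factory builds a new dict for every instance, so instances never share state.
    params: dict = field(default_factory=dict)


plain = SchemaSketch(target=dict)
configured = SchemaSketch(target=dict, params={"currency": "USD"})

# Mutating one instance's params leaves other instances untouched.
plain.params["debug"] = True
other = SchemaSketch(target=dict)
```

A mutable default written as `params: dict = {}` would be rejected by dataclass, which is why the factory form is the usual choice for dict-valued fields.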