models.base

Module with base model schema and its configuration.

Classes

  • BaseModel - Base class for defining search schemas.

  • Field - Field configuration for search schema.

  • MigrationSchema - Configuration of model migration.

class MigrationSchema(target: Type, params: dict = <factory>)

Bases: object

Defines the schema for a model migration that must pass additional parameters to the target model's initialization, as can be needed for nested models.

Attributes

target : Type

Target model class to migrate to.

params : dict, optional

Additional parameters to pass to the target model initialization.

target: Type
params: dict
__init__(target: Type, params: dict = <factory>) -> None
post(field: str) -> Callable[[Callable], Callable]

Decorator to mark a method as a post-processor for a model field. The method is called after the field is extracted from the element, during model instance initialization.

Example

>>> class MyModel(BaseModel):
...    ...
...    field = ...
...
...    @post("field")
...    def post_process_field(self, value):
...        return value.strip()

Methods of a custom model class that are decorated with @post must accept exactly one argument in addition to self: the value of the field to be processed.
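The mechanism can be pictured with a small, library-independent sketch. The names post_sketch and ToyModel and the _post_field marker attribute are hypothetical; this is not how soupsavvy implements @post, only one plausible way to tag a method and apply it once a value is available.

```python
from typing import Any, Callable


def post_sketch(field: str) -> Callable[[Callable], Callable]:
    """Hypothetical stand-in for @post: tags a method with the field it handles."""

    def decorator(func: Callable) -> Callable:
        func._post_field = field  # marker attribute, an assumed name
        return func

    return decorator


class ToyModel:
    """Toy class that applies tagged post-processors during initialization."""

    def __init__(self, **raw: Any) -> None:
        # Collect the methods tagged by post_sketch, keyed by field name.
        processors = {
            member._post_field: member
            for member in vars(type(self)).values()
            if callable(member) and hasattr(member, "_post_field")
        }
        for name, value in raw.items():
            if name in processors:
                # The processor receives only the field value (besides self).
                value = processors[name](self, value)
            setattr(self, name, value)

    @post_sketch("title")
    def clean_title(self, value: str) -> str:
        return value.strip()


model = ToyModel(title="  Hello  ", price=10)
```

Here only title has a post-processor, so price is stored unchanged.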

serializer(field: str) -> Callable[[Callable], Callable]

Decorator to mark a method as a serializer for a model field. The method is called by the json method to serialize the field value of the model to JSON format.

Example

>>> class MyModel(BaseModel):
...    ...
...    field = ...
...
...    @serializer("field")
...    def serialize_field(self, value):
...        return json.dumps(value)

Methods of a custom model class that are decorated with @serializer must accept exactly one argument in addition to self: the value of the field to be serialized.
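As an illustration of the idea, the following self-contained sketch shows a per-field serializer being picked up by a json-like method. The names serializer_sketch, ToyEvent and the _serializes marker are hypothetical and do not come from soupsavvy; the real lookup mechanism may differ.

```python
import json
from datetime import date
from typing import Callable


def serializer_sketch(field: str) -> Callable[[Callable], Callable]:
    """Hypothetical stand-in for @serializer: tags a method with its field."""

    def decorator(func: Callable) -> Callable:
        func._serializes = field  # marker attribute, an assumed name
        return func

    return decorator


class ToyEvent:
    def __init__(self, name: str, day: date) -> None:
        self.name = name
        self.day = day

    @serializer_sketch("day")
    def serialize_day(self, value: date) -> str:
        # date objects are not JSON-serializable, so convert to ISO text.
        return value.isoformat()

    def json(self) -> dict:
        # Find tagged serializers, then apply them field by field.
        serializers = {
            member._serializes: member
            for member in vars(type(self)).values()
            if callable(member) and hasattr(member, "_serializes")
        }
        return {
            name: serializers[name](self, value) if name in serializers else value
            for name, value in vars(self).items()
        }


event = ToyEvent("launch", date(2024, 1, 31))
payload = event.json()
encoded = json.dumps(payload)  # succeeds because day is now a string
```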

class Field(selector: TagSearcher | TagSearcherMeta, repr: bool = True, compare: bool = True, migrate: bool = True)

Bases: TagSearcher, Comparable

Model field wrapper that defines field metadata. Used for overriding the default behavior of the attribute corresponding to the field in a model instance, similarly to the dataclass field function.

Attributes

selector : TagSearcher | type[BaseModel]

Any searcher used in model as field.

repr : bool, optional

Whether the field should be included in the model’s representation. Default is True.

compare : bool, optional

Whether the field should be included in the model’s equality comparison as well as in the hash calculation. Default is True.

migrate : bool, optional

Whether the field should be migrated to the target model in model migration. Default is True.

Example

>>> class MyModel(BaseModel):
...    __scope__ = TypeSelector("p")
...
...    price = Text() | Operation(int)
...    element = Field(SelfSelector(), repr=False, compare=False, migrate=False)

In this example, only price is relevant to the model object. The element field is kept only for reference and is excluded from the model's representation, comparison, and migration.

Using the Field wrapper without any additional arguments is equivalent to the default behavior.
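For readers unfamiliar with the dataclass analogy mentioned above, this standard-library sketch shows what the repr and compare flags do in dataclasses.field. The Product class is hypothetical, and dataclasses has no counterpart of the migrate flag, so only the first two options are illustrated.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    price: int
    # Kept for reference only: hidden from repr() and ignored by ==.
    element: str = field(default="", repr=False, compare=False)


a = Product(price=10, element="<p>10</p>")
b = Product(price=10, element="<span>10</span>")

same = a == b    # element does not take part in the comparison
shown = repr(a)  # element does not appear in the representation
```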

selector: TagSearcher | TagSearcherMeta
repr: bool = True
compare: bool = True
migrate: bool = True
find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[Any]

Processes IElement object and returns list of results.

Parameters

tag : IElement

Any IElement object to process.

recursive : bool, optional

Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.

limit : int, optional

Specifies maximum number of results to return in a list. By default None, everything is returned.

Returns

list[Any]

A list of results from processed element.

find(tag: IElement, strict: bool = False, recursive: bool = True) -> Any

Processes IElement object and returns result.

Parameters

tag : IElement

Any IElement object to process.

strict : bool, optional

If True, requires a result to be found in the element. By default False.

recursive : bool, optional

Specifies if search should be recursive. If set to False, only direct children of the element will be searched. By default True.

Returns

Any

Processed result from the element.

__init__(selector: TagSearcher | TagSearcherMeta, repr: bool = True, compare: bool = True, migrate: bool = True) -> None
class ModelMeta(name, bases, namespace, /, **kwargs)

Bases: TagSearcherMeta

Metaclass for all models derived from BaseModel. This metaclass ensures that certain attributes and methods are defined and properly configured in each model class.

It handles validation of provided attributes and controls inheritance of fields. Meta inherits from type(ABC) to avoid metaclass conflicts.

Attributes

scope : SoupSelector

Returns the __scope__ attribute, which defines the scope within which the model is to be found.

fields : dict[str, TagSearcher]

Fields that define the model and its search operations.

__init__(name, bases, class_dict)

Initializes the model class and validates its attributes. For each user-defined model, which is a subclass of BaseModel, checks if scope and fields are properly defined.

Raises

ScopeNotDefinedException

If the scope attribute is missing in the model class.

FieldsNotDefinedException

If no fields are defined in the model class.
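The kind of class-creation check described above can be sketched with a plain metaclass. Everything here (ToyMeta, ToyBase, ToyField, ScopeMissing, FieldsMissing) is a hypothetical stand-in, and the inheritance rules shown (scope and fields inherited from parent models) are an assumption, not a statement about soupsavvy's exact behavior.

```python
class ScopeMissing(Exception):
    """Hypothetical counterpart of ScopeNotDefinedException."""


class FieldsMissing(Exception):
    """Hypothetical counterpart of FieldsNotDefinedException."""


class ToyField:
    """Stand-in for a searcher assigned as a class attribute."""


class ToyMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        if not bases:
            return  # the base class itself is not validated
        if getattr(cls, "__scope__", None) is None:
            raise ScopeMissing(f"{name} does not define __scope__")
        if not cls.fields:
            raise FieldsMissing(f"{name} does not define any fields")

    @property
    def fields(cls) -> dict:
        # Walk the MRO from the oldest ancestor so subclasses can override.
        collected = {}
        for klass in reversed(cls.__mro__):
            for key, value in vars(klass).items():
                if isinstance(value, ToyField):
                    collected[key] = value
        return collected


class ToyBase(metaclass=ToyMeta):
    __scope__ = None


class Book(ToyBase):
    __scope__ = "div.book"
    title = ToyField()


class Novel(Book):  # inherits the scope and the title field
    pages = ToyField()


def define_without_scope():
    # Raises ScopeMissing as soon as the class body is evaluated.
    class Broken(ToyBase):
        title = ToyField()
```

Because validation runs in the metaclass, a misconfigured model fails at definition time rather than on first use.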

property scope: SoupSelector

Returns the __scope__ attribute, which defines the scope selector for the model.

Returns

SoupSelector

The scope selector used to identify the element in which the model is searched.

property fields: dict[str, Field]

Returns the fields of the model class with their respective TagSearcher instances.

Returns

dict[str, Field]

A dictionary mapping field names to their respective Field instances.

class BaseModel(**kwargs)

Bases: TagSearcher, Comparable, JSONSerializable

Base class for all user-defined models in soupsavvy.

__init__(**kwargs) -> None

Initializes a model instance with the provided field values. A model should not be initialized directly, but through the find methods.

Parameters

kwargs : Any

Field values to initialize the model with, provided as keyword arguments.

Raises

MissingFieldsException

If any required fields are missing from the provided kwargs.

UnknownModelFieldException

If any unknown fields are provided in the kwargs.
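The two checks can be illustrated with a toy class. ToyRecord, its FIELDS tuple and both exception names are hypothetical, and the order in which the real initializer performs the checks is not stated in this reference, so the order below is an assumption.

```python
from typing import Any


class MissingFieldsError(Exception):
    """Hypothetical counterpart of MissingFieldsException."""


class UnknownFieldError(Exception):
    """Hypothetical counterpart of UnknownModelFieldException."""


class ToyRecord:
    FIELDS = ("title", "price")  # assumed set of declared fields

    def __init__(self, **kwargs: Any) -> None:
        unknown = sorted(set(kwargs) - set(self.FIELDS))
        if unknown:
            raise UnknownFieldError(f"unknown fields: {unknown}")
        missing = [name for name in self.FIELDS if name not in kwargs]
        if missing:
            raise MissingFieldsError(f"missing fields: {missing}")
        for name in self.FIELDS:
            setattr(self, name, kwargs[name])

    @property
    def attributes(self) -> dict:
        # Mirrors the idea of a field-name-to-value mapping.
        return {name: getattr(self, name) for name in self.FIELDS}


record = ToyRecord(title="Pen", price=3)
```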

property attributes: dict[str, Any]

Returns a dictionary of model instance attributes representing model fields and their respective values.

Returns

dict[str, Any]

A dictionary mapping model field names to their respective values.

classmethod find(tag: IElement, strict: Literal[True], recursive: bool = True) -> Self
classmethod find(tag: IElement, strict: Literal[False] = False, recursive: bool = True) -> Self | None
classmethod find(tag: IElement, strict: bool = False, recursive: bool = True) -> Self | None

Searches for and returns an instance of the model within the provided element. By default, performs a recursive, non-strict search for model fields within the scope element.

Parameters

tag : IElement

Any IElement object to search within for the model.

strict : bool, optional

If True, requires the model's scope to be found in the element; otherwise ModelNotFoundException is raised. Default is False.

recursive : bool, optional

Whether the search for the model scope element should be recursive. Default is True.

Returns

Self | None

An instance of the model if found, otherwise None.

Raises

ModelNotFoundException

If the model’s scope is not found and strict is True.

FieldExtractionException

If any model field failed to be extracted.

classmethod find_all(tag: IElement, recursive: bool = True, limit: int | None = None) -> list[Self]

Searches for and returns all instances of the model within the provided element. By default, performs a recursive, non-strict search for model fields, just like the find method.

Parameters

tag : IElement

Any IElement object to search within for the model.

recursive : bool, optional

Whether the search for the model scope element should be recursive. Default is True.

limit : int, optional

Maximum number of model instances to return. Default is None, which returns all instances found.

Returns

list[Self]

A list of model instances found within the element.
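The interplay of the strict, recursive and limit parameters of find and find_all can be illustrated on a toy tree. Node, NotFoundError, find_sketch and find_all_sketch are hypothetical names; the sketch matches nodes by name only and does not involve IElement or any soupsavvy type.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Node:
    name: str
    children: list["Node"] = field(default_factory=list)


class NotFoundError(Exception):
    """Hypothetical counterpart of ModelNotFoundException."""


def _walk(node: Node, recursive: bool) -> Iterator[Node]:
    # Yields direct children, and all descendants when recursive is True.
    for child in node.children:
        yield child
        if recursive:
            yield from _walk(child, recursive=True)


def find_all_sketch(
    node: Node, name: str, recursive: bool = True, limit: Optional[int] = None
) -> list[Node]:
    matches = [n for n in _walk(node, recursive) if n.name == name]
    return matches[:limit]  # slicing with None keeps every match


def find_sketch(
    node: Node, name: str, strict: bool = False, recursive: bool = True
) -> Optional[Node]:
    found = find_all_sketch(node, name, recursive=recursive, limit=1)
    if found:
        return found[0]
    if strict:
        raise NotFoundError(f"no {name!r} found")
    return None  # non-strict search reports absence with None


root = Node("body", [Node("div", [Node("p"), Node("p")]), Node("p")])
```

With this tree, a recursive search for "p" sees three nodes, while a non-recursive search sees only the single direct child.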

migrate(model: Type[T], mapping: dict[Type[BaseModel], Type | MigrationSchema] | None = None, **kwargs) -> T

Migrates the model instance to another model class using its fields in target class initialization. Recursively migrates nested models, creating new instances, even when target model is not defined in the mapping.

Parameters

model : Type[Model]

The target model class to migrate the instance to.

mapping : dict[Type[BaseModel], Union[Type, MigrationSchema]], optional

Mapping of base model fields to target models. By default, if a field is an instance of BaseModel, it will be passed directly to the target model.

kwargs : Any

Additional keyword arguments to pass to model initialization.

Migrating to the same model is equivalent to creating a deep copy of the model, which can be achieved by calling the copy method.

Returns

Model

An instance of the target model class.
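A rough picture of recursive migration with a mapping, using plain dataclasses instead of soupsavvy models. The names migrate_sketch and ToySchema and the four record classes are hypothetical; ToySchema only mimics the target and params attributes of MigrationSchema, and the real method may treat nested values differently.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, Optional


@dataclass
class ToySchema:
    """Stand-in for MigrationSchema: a target class plus extra init params."""
    target: type
    params: dict = field(default_factory=dict)


@dataclass
class Price:
    amount: int


@dataclass
class Product:
    name: str
    price: Price


@dataclass
class PriceRecord:
    amount: int
    currency: str = "USD"


@dataclass
class ProductRecord:
    name: str
    price: PriceRecord


def migrate_sketch(
    instance: Any, target: type, mapping: Optional[dict] = None, **kwargs: Any
) -> Any:
    mapping = mapping or {}
    params = {}
    for f in fields(instance):
        value = getattr(instance, f.name)
        if is_dataclass(value) and not isinstance(value, type):
            # Nested model: use the mapped target, or its own class if unmapped.
            rule = mapping.get(type(value), type(value))
            if isinstance(rule, ToySchema):
                value = migrate_sketch(value, rule.target, mapping, **rule.params)
            else:
                value = migrate_sketch(value, rule, mapping)
        params[f.name] = value
    return target(**params, **kwargs)


product = Product("pen", Price(3))
record = migrate_sketch(
    product,
    ProductRecord,
    mapping={Price: ToySchema(PriceRecord, {"currency": "EUR"})},
)
clone = migrate_sketch(product, Product)  # same class: behaves like a deep copy
```

Because unmapped nested values are rebuilt from their own class, migrating to the same class yields new, equal instances at every level.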

copy() -> Self

Creates a deep copy of the model instance by migrating it to the same model class. Only model fields defined in attributes are used to create the new instance.

Returns

Self

A deep copy of the model instance.

json() -> dict

Converts the model instance to a JSON-serializable dictionary.

Returns

dict

A json-serializable representation of the model instance.