Attention

The new major version is out!
The library was renamed to adaptix because its scope has been extended.

This update features:

  1. Support for model-to-model conversion.

  2. Support for attrs and sqlalchemy (integration with many other libraries is coming).

  3. A fully redesigned API that helps you follow DRY.

  4. Performance improvements of up to two times.

Quickstart

Dataclass factory analyzes your type hints and generates corresponding parsers based on the retrieved information. For dataclasses, it checks which fields are declared and then calls the normal constructor. For other types, the behavior can differ.

You can also configure it using schemas (see Extended usage).
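To make the idea of hint-driven parsing concrete, here is a minimal standard-library sketch. It is not the library's implementation: make_loader is a hypothetical helper, and it only handles flat dataclasses whose annotations are callable types such as str or int.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Callable, get_type_hints


def make_loader(cls: type) -> Callable[[dict], Any]:
    """Hypothetical helper: build a loader for a flat dataclass from its hints."""
    if not is_dataclass(cls):
        raise TypeError(f"{cls!r} is not a dataclass")
    hints = get_type_hints(cls)  # field name -> annotated type
    names = [f.name for f in fields(cls) if f.init]

    def load(data: dict) -> Any:
        # Pass only the keys that are present, so the normal constructor
        # applies defaults and reports missing required fields itself.
        kwargs = {name: hints[name](data[name]) for name in names if name in data}
        return cls(**kwargs)

    return load


@dataclass
class Book:
    title: str
    price: int
    author: str = "Unknown author"


load_book = make_loader(Book)
book = load_book({"title": "Fahrenheit 451", "price": 100})
print(book)
```

The default for author is filled in by the dataclass constructor, not by the loader, which mirrors the behavior described above.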

Installation

Just use pip to install the library:

pip install dataclass_factory

Simple case

All you have to do to start parsing your dataclasses is create a Factory instance. Then call the load or dump method with the corresponding type, and everything is done automatically.

from dataclasses import dataclass
import dataclass_factory


@dataclass
class Book:
    title: str
    price: int
    author: str = "Unknown author"


data = {
    "title": "Fahrenheit 451",
    "price": 100,
}

factory = dataclass_factory.Factory()
book: Book = factory.load(data, Book)  # Same as Book(title="Fahrenheit 451", price=100)
serialized = factory.dump(book)

All typing information is retrieved from your annotations, so you do not have to provide any schema or even change your dataclass decorators or class bases.

In the provided example, book.author == "Unknown author" because the normal dataclass constructor is called.

It is better to create the factory only once, because all parsers are cached inside it after first use. Otherwise, the structure of your classes will be analysed again for every new instance of Factory.
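The effect of per-instance caching can be illustrated with a small standard-library sketch. TinyFactory is a hypothetical stand-in, not the real Factory; it only counts how often a class structure is inspected.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Dict


class TinyFactory:
    """Hypothetical stand-in that caches one loader per class."""

    def __init__(self) -> None:
        self._loaders: Dict[type, Callable[[dict], Any]] = {}
        self.analysed = 0  # how many times a class structure was inspected

    def _build_loader(self, cls: type) -> Callable[[dict], Any]:
        self.analysed += 1
        names = [f.name for f in fields(cls)]
        return lambda data: cls(**{n: data[n] for n in names if n in data})

    def load(self, data: dict, cls: type) -> Any:
        if cls not in self._loaders:  # analyse only on first use
            self._loaders[cls] = self._build_loader(cls)
        return self._loaders[cls](data)


@dataclass
class Book:
    title: str
    price: int


# One shared instance: the class is inspected once, then the loader is reused.
shared = TinyFactory()
for _ in range(3):
    shared.load({"title": "1984", "price": 100}, Book)

# A new instance per call: the cache starts empty every time.
fresh_counts = []
for _ in range(3):
    fresh = TinyFactory()
    fresh.load({"title": "1984", "price": 100}, Book)
    fresh_counts.append(fresh.analysed)

print(shared.analysed, sum(fresh_counts))
```

With the shared instance the analysis runs once for three loads; with fresh instances it runs once per load. In practice this usually means keeping a single module-level factory.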

Nested objects

Nested objects are supported out of the box. You do not have to do anything except define your dataclasses. For example, you expect the author of a Book to be an instance of Person, but in serialized form it is a dictionary.

Declare your dataclasses as usual and then just parse your data.

from dataclasses import dataclass

import dataclass_factory


@dataclass
class Person:
    name: str


@dataclass
class Book:
    title: str
    price: int
    author: Person


data = {
    "title": "Fahrenheit 451",
    "price": 100,
    "author": {
        "name": "Ray Bradbury"
    }
}

factory = dataclass_factory.Factory()

# Book(title="Fahrenheit 451", price=100, author=Person("Ray Bradbury"))
book: Book = factory.load(data, Book)
serialized = factory.dump(book)

Lists and other collections

Want to parse a collection of dataclasses? No changes are required: just specify the correct target type (e.g. List[SomeClass] or Dict[str, SomeClass]).

from typing import List

from dataclasses import dataclass

import dataclass_factory


@dataclass
class Book:
    title: str
    price: int


data = [
    {
        "title": "Fahrenheit 451",
        "price": 100,
    },
    {
        "title": "1984",
        "price": 100,
    }
]

factory = dataclass_factory.Factory()
books = factory.load(data, List[Book])
serialized = factory.dump(books)

Fields can also contain any supported collections.

Error handling

Currently, the parser does not throw any specific exception when parsing fails. Errors are the same as those raised by the corresponding constructors. In normal cases, all suitable exceptions are listed in dataclass_factory.PARSER_EXCEPTIONS.

from dataclasses import dataclass

import dataclass_factory


@dataclass
class Book:
    title: str
    price: int
    author: str = "Unknown author"


data = {
    "title": "Fahrenheit 451"
}

factory = dataclass_factory.Factory()

try:
    book: Book = factory.load(data, Book)
except dataclass_factory.PARSER_EXCEPTIONS as e:
    # Cannot parse:  <class 'TypeError'> __init__() missing 1 required positional argument: 'price'
    print("Cannot parse: ", type(e), e)
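Because errors come from the underlying constructors, the TypeError shown in the comment above is the one the dataclass constructor raises on its own. The following standard-library snippet reproduces it without any factory; the exact wording of the message may vary between Python versions.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    price: int
    author: str = "Unknown author"


try:
    Book(title="Fahrenheit 451")  # "price" is required but missing
except TypeError as e:
    error = e  # keep a reference; "e" is cleared when the except block ends
    print("Constructor failed:", type(e).__name__, e)
```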

Validation

Data can be validated at two levels:

  • per-field validation

  • whole structure validation

In the first case, you can use the @validate decorator with a field name to check the data.

Here are the details:

  • a validator can be called before parsing the field data (set pre=True) or after it.

  • field validators are applied after all name transformations (name styles, structure flattening), so use the field name as it appears in your class.

  • a validator can be applied to multiple fields: just provide multiple names.

  • a validator can be applied to every field separately: just do not set any field name.

  • a validator must return data if the checks succeed. The data can be the same as was passed in or anything else: a validator CAN change data.

  • field validators cannot be set in the default schema.

  • multiple decorators can be used on a single method.

  • multiple validators can be applied to a single field.

  • use different method names to prevent overwriting.

from dataclasses import dataclass

from dataclass_factory import validate, Factory, Schema, NameStyle


class MySchema(Schema):
    SOMETHING = "Some string"

    @validate("int_field")  # use original field name in class
    def validate_field(self, data):
        if data > 100:
            raise ValueError
        return data * 100  # validator can change value

    # this validator will be called before parsing field
    @validate("complex_field", pre=True)
    def validate_field_pre(self, data):
        return data["value"]

    @validate("info")
    def validate_stub(self, data):
        return self.SOMETHING  # validator can access schema fields


@dataclass
class My:
    int_field: int
    complex_field: int
    info: str


factory = Factory(schemas={
    My: MySchema(name_style=NameStyle.upper_snake)  # name style does not affect how validators are bound to fields
})

result = factory.load({"INT_FIELD": 1, "COMPLEX_FIELD": {"value": 42}, "INFO": "ignored"}, My)
assert result == My(100, 42, "Some string")

If you want to check the whole structure, you can do any check in the pre_parse or post_parse step. The idea is the same:

  • pre_parse is called before structure parsing is done (even before data is flattened and names are processed).

  • post_parse is called after successful parsing.

  • only one pre_parse and one post_parse method can be defined in a class.

from dataclasses import dataclass

import dataclass_factory
from dataclass_factory import Schema


@dataclass
class Book:
    title: str
    price: int
    author: str = "Unknown author"


data = {
    "title": "Fahrenheit 451",
    "price": 100,
}
invalid_data = {
    "title": "1984",
    "price": -100,
}


def validate_book(book: Book) -> Book:
    if not book.title:
        raise ValueError("Empty title")
    if book.price <= 0:
        raise ValueError("InvalidPrice")
    return book


factory = dataclass_factory.Factory(schemas={Book: Schema(post_parse=validate_book)})
book = factory.load(data, Book)  # No error
other_book = factory.load(invalid_data, Book)  # ValueError: InvalidPrice
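Since validate_book is an ordinary function, it can be exercised without a factory, for example in a unit test. The sketch below uses only the standard library and the names from the example above; when loading through the factory, the failing call would be wrapped in try/except in the same way.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    price: int
    author: str = "Unknown author"


def validate_book(book: Book) -> Book:
    if not book.title:
        raise ValueError("Empty title")
    if book.price <= 0:
        raise ValueError("InvalidPrice")
    return book


# A valid book is returned unchanged.
valid = Book("Fahrenheit 451", 100)
checked = validate_book(valid)

# An invalid book raises ValueError, which the caller can handle.
try:
    validate_book(Book("1984", -100))
except ValueError as e:
    message = str(e)
    print("Validation failed:", message)
```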