Attention
This update features:
Support for model-to-model conversion.
Support for attrs and sqlalchemy (integration with many other libraries is coming).
Fully redesigned API helping to follow DRY.
Performance improvements of up to two times.
Quickstart
Dataclass factory analyzes your type hints and generates corresponding parsers based on the retrieved information. For dataclasses, it checks which fields are declared and then calls the normal constructor. For other types, behavior can differ.
You can also configure it using various schemas (see Extended usage).
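The idea of building a parser from type hints can be illustrated with a rough sketch that uses only the standard library. This is not the library's implementation: make_parser and parse_book are hypothetical names, and the sketch assumes every field type is either a dataclass or a plain callable type such as int or str.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Callable, get_type_hints


def make_parser(cls: type) -> Callable[[Any], Any]:
    """Build a parser for `cls` from its type hints (illustrative only)."""
    if not is_dataclass(cls):
        # Plain types such as int or str: the type itself acts as the parser.
        return cls
    hints = get_type_hints(cls)
    # One parser per declared field, created ahead of time.
    field_parsers = {f.name: make_parser(hints[f.name]) for f in fields(cls)}

    def parse(data: dict) -> Any:
        # Only fields present in the input are passed on, so the normal
        # constructor fills in defaults or raises for missing required fields.
        kwargs = {
            name: parser(data[name])
            for name, parser in field_parsers.items()
            if name in data
        }
        return cls(**kwargs)

    return parse


@dataclass
class Book:
    title: str
    price: int
    author: str = "Unknown author"


parse_book = make_parser(Book)
book = parse_book({"title": "Fahrenheit 451", "price": 100})
```

Because the sketch ends by calling the normal constructor, the default value of author is applied without any extra code.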
Installation
Just use pip to install the library:
pip install dataclass_factory
Simple case
All you have to do to start parsing your dataclasses is create a Factory instance.
Then call the load or dump method with the corresponding type and everything is done automatically.
from dataclasses import dataclass

import dataclass_factory


@dataclass
class Book:
    title: str
    price: int
    author: str = "Unknown author"


data = {
    "title": "Fahrenheit 451",
    "price": 100,
}

factory = dataclass_factory.Factory()
book: Book = factory.load(data, Book)  # Same as Book(title="Fahrenheit 451", price=100)
serialized = factory.dump(book)
All typing information is retrieved from your annotations, so you are not required to provide any schema or even change your dataclass decorators or class bases.
In the provided example, book.author == "Unknown author" because the normal dataclass constructor is called.
It is better to create the factory only once, because all parsers are cached inside it after first use. Otherwise, the structure of your classes will be analysed again and again for every new instance of Factory.
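The effect of caching can be shown with a small standard-library sketch. TinyFactory is a hypothetical stand-in written for this illustration, not the real Factory: it counts how often a class is analysed so that a shared instance can be compared with instances created on every call.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Dict


class TinyFactory:
    """Hypothetical stand-in for a factory; not the library's implementation."""

    def __init__(self) -> None:
        self._parsers: Dict[type, Callable[[dict], Any]] = {}
        self.analysis_count = 0  # how many times a class was analysed

    def _build_parser(self, cls: type) -> Callable[[dict], Any]:
        self.analysis_count += 1
        names = [f.name for f in fields(cls)]  # stands in for the costly analysis
        return lambda data: cls(**{n: data[n] for n in names if n in data})

    def load(self, data: dict, cls: type) -> Any:
        if cls not in self._parsers:  # analyse only on first use
            self._parsers[cls] = self._build_parser(cls)
        return self._parsers[cls](data)


@dataclass
class Book:
    title: str
    price: int


data = {"title": "1984", "price": 100}

# One shared instance: the class is analysed once and the parser is reused.
shared = TinyFactory()
for _ in range(3):
    shared.load(data, Book)

# A new instance per call: the analysis is repeated every time.
fresh_total = 0
for _ in range(3):
    fresh = TinyFactory()
    fresh.load(data, Book)
    fresh_total += fresh.analysis_count
```

After this runs, shared.analysis_count is 1 while fresh_total is 3, which is the repeated work that a single long-lived factory avoids.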
Nested objects
Nested objects are supported out of the box. You do not have to do anything except define your dataclasses. For example, you expect the author of a Book to be an instance of Person, but in serialized form it is a dictionary.
Declare your dataclasses as usual and then just parse your data.
from dataclasses import dataclass

import dataclass_factory


@dataclass
class Person:
    name: str


@dataclass
class Book:
    title: str
    price: int
    author: Person


data = {
    "title": "Fahrenheit 451",
    "price": 100,
    "author": {
        "name": "Ray Bradbury"
    }
}

factory = dataclass_factory.Factory()
# Book(title="Fahrenheit 451", price=100, author=Person("Ray Bradbury"))
book: Book = factory.load(data, Book)
serialized = factory.dump(book)
Lists and other collections
Want to parse a collection of dataclasses? No changes are required, just specify the correct target type (e.g. List[SomeClass] or Dict[str, SomeClass]).
from dataclasses import dataclass
from typing import List

import dataclass_factory


@dataclass
class Book:
    title: str
    price: int


data = [
    {
        "title": "Fahrenheit 451",
        "price": 100,
    },
    {
        "title": "1984",
        "price": 100,
    }
]

factory = dataclass_factory.Factory()
books = factory.load(data, List[Book])
serialized = factory.dump(books)
Fields can also contain any supported collection.
Error handling
Currently the parser does not raise any library-specific exception when parsing fails. The errors are the same as those raised by the corresponding constructors.
In normal cases, all suitable exceptions are listed in dataclass_factory.PARSER_EXCEPTIONS
from dataclasses import dataclass

import dataclass_factory


@dataclass
class Book:
    title: str
    price: int
    author: str = "Unknown author"


data = {
    "title": "Fahrenheit 451"
}

factory = dataclass_factory.Factory()

try:
    book: Book = factory.load(data, Book)
except dataclass_factory.PARSER_EXCEPTIONS as e:
    # Cannot parse: <class 'TypeError'> __init__() missing 1 required positional argument: 'price'
    print("Cannot parse: ", type(e), e)
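To see why these errors look like ordinary Python errors, it helps to trigger them directly. The sketch below uses only the standard library and makes no dataclass_factory calls; the int(...) conversion is just one assumed example of a constructor that can reject a value.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    price: int


errors = []

try:
    Book(**{"title": "Fahrenheit 451"})  # required field "price" is missing
except TypeError as e:
    errors.append(type(e).__name__)

try:
    int("one hundred")  # the value cannot be converted to an int
except ValueError as e:
    errors.append(type(e).__name__)
```

Both exceptions are raised by the constructors themselves, which is why catching a tuple of common exception types is enough to handle parse failures.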
Validation
Validation of data can be done at two levels:
per-field validation
whole-structure validation
For the first, use the @validate decorator with a field name to check the data.
Here are the details:
a validator can be called before parsing the field data (set pre=True) or after it
field validators are applied after all name transformations (name styles, structure flattening), so use the field name as it is declared in your class
a validator can be applied to multiple fields: just provide multiple names
a validator can be applied to every field separately: just do not set any field name
a validator must return data if the checks succeed. The data can be the same as what was passed in or anything else: a validator CAN change data
field validators cannot be set in the default schema
multiple decorators can be used on a single method
multiple validators can be applied to a single field
use different method names to prevent overwriting
from dataclasses import dataclass

from dataclass_factory import validate, Factory, Schema, NameStyle


class MySchema(Schema):
    SOMETHING = "Some string"

    @validate("int_field")  # use original field name in class
    def validate_field(self, data):
        if data > 100:
            raise ValueError
        return data * 100  # validator can change value

    # this validator will be called before parsing field
    @validate("complex_field", pre=True)
    def validate_field_pre(self, data):
        return data["value"]

    @validate("info")
    def validate_stub(self, data):
        return self.SOMETHING  # validator can access schema fields


@dataclass
class My:
    int_field: int
    complex_field: int
    info: str


factory = Factory(schemas={
    My: MySchema(name_style=NameStyle.upper_snake)  # name style does not affect how validators are bound to fields
})

result = factory.load({"INT_FIELD": 1, "COMPLEX_FIELD": {"value": 42}, "INFO": "ignored"}, My)
assert result == My(100, 42, "Some string")
If you want to check the whole structure, you can do any check in the pre_parse or post_parse step.
The idea is the same:
pre_parse is called before structure parsing is done (even before data is flattened and names are processed)
post_parse is called after successful parsing
only one pre_parse and one post_parse method can be set in a class
from dataclasses import dataclass

import dataclass_factory
from dataclass_factory import Schema


@dataclass
class Book:
    title: str
    price: int
    author: str = "Unknown author"


data = {
    "title": "Fahrenheit 451",
    "price": 100,
}
invalid_data = {
    "title": "1984",
    "price": -100,
}


def validate_book(book: Book) -> Book:
    if not book.title:
        raise ValueError("Empty title")
    if book.price <= 0:
        raise ValueError("InvalidPrice")
    return book


factory = dataclass_factory.Factory(schemas={Book: Schema(post_parse=validate_book)})
book = factory.load(data, Book)  # No error
other_book = factory.load(invalid_data, Book)  # ValueError: InvalidPrice