Repositories

A repository is a special-purpose service object for keeping track of objects.

Below some design principles for repositories are stated.

  • A repository is not the same as an RDBMS

    The mechanism used for keeping track of the objects can be anything.

    What is it with RDBMS as a default persistence mechanism anyway...

    All too often the management of application data is externalized from the application to a standalone data management system like a relational database management system (RDBMS). Data conserns like data aggregation, data access control, and data retrieval latency are important application design issues. Application data should by default belong to and be under strict control by the application (team) itself.

    The translations between the two-dimensional relational data model, and the three-dimensional object model are most often (yet again) externalized to an object-relational-mapping (ORM) tool. The use of an ORM tool brings tremendous, and most often unessesary, complexity into your application design. Bringing an ORM tool into a low-end application is an overkill. Also, at the other end of the scale; an RDBMS is not suitable for high-volume/high-throughput data applications unless elaborate tuning is applied. What's left is medium-range applications. One valid argument in favour of an ORM/RDBMS solution is if data is already placed in a RDBMS and in use by several other applications. Perhaps is it also already included in some data-mining activities. Only then the choice to stick with the RDBMS seems like an obvious one. Otherwise, you should think twice before choosing an ORM/RDBMS solution. At least, investigate some alternatives while you still have the chance.

  • A repository is not the same as persistence

    As stated above, a repository is a special-purpose service object for keeping track of entity objects. That is, holding your entities so they are not garbage collected. Also, your entities should be easy to retrieve and otherwise manage. Persistence, albeit necessary in most non-trivial cases, is not a mandatory requirement for a repository.

  • Repositories should be out-of-the-box

    When you need a repository, it should be just to pick an implementation, instansiate it, and start using it following a well-known API with consistent semantics. If you need another repository type inhibiting other properties, it should be easy to migrate your data. Importing and exporting application data should be a trivial task. Data should be a liquid asset.

  • Queries should have a native look-and-feel

    A Java-based application should query for objects using Java. The query expressions should be consise, highly readable, and easy to maintain. Specifications are perfect as a query construct, but it does not stop there. Specifications define object spaces, so thay are perfect for partitioning your objects as well.

Domian Repositories

Domian repositories only keep track of entity objects. This is because of the transient nature of value objects. Value objects should not be put aside and reused in any way, but rather purged and recreated when needed.

Domian repositories are described by this interface:

public interface Repository<T extends Entity> {
    <V extends T> void          put                           (V entity);
                  void          putAll                        (Collection<? extends T> collectionOfEntities);
                  Long          countAllEntitiesSpecifiedBy   (Specification<? extends T> specification);
    <V extends T> Iterator<V>   iterateAllEntitiesSpecifiedBy (Specification<V> specification);
    <V extends T> Collection<V> findAllEntitiesSpecifiedBy    (Specification<V> specification);
    <V extends T> V             findSingleEntitySpecifiedBy   (Specification<V> specification);
    <V extends T> void          update                        (V entity);
    <V extends T> Long          removeAllEntitiesSpecifiedBy  (Specification<V> specification);
    <V extends T> Boolean       remove                        (V entity);
}

The interface also contains aliased methods, many of them compacted versions. See the net.sourceforge.domian.repository.Repository Javadoc for the complete listing.

Repository types

Figure 1 - Domian Repository Types

Repository usage

Creating a Repository

Repository<Customer> customerRepo = new InMemoryRepository<Customer>();

Adding entities

import static java.util.Arrays.asList;
...
Customer myCustomer = new Customer(1L, today);
Customer anotherCustomer = new Customer(2L, yesterday);
customerRepo.put(myCustomer);
customerRepo.put(anotherCustomer);
// or just
customerRepo.putAll(asList(myCustomer, anotherCustomer));

Finding entities

Query your entities using Specifications.

import static net.sourceforge.domian.specification.SpecificationFactory.*;
...
Collection<Customer> allCustomersCollection = customerRepo.find(all(Customer.class));
...
Specification<Customer> customersRegisteredBeforeToday = all(Customer.class).where("membershipDate", isBefore(today));
Collection<Customer> customersRegisteredBeforeTodayCollection = customerRepo.findAll(customersRegisteredBeforeToday);

Iterating entities

Iterate your entities using Specifications.

Iterator<Customer> customerIterator = customerRepo.iterate(all(Customer.class).where("membershipDate", is(yesterday)));

Updating entities

When altering state for an entity obtained through a repository, it must be updated to be sure the changes are visible for other threads.

For all in-memory-based repositories object state changes are, as we all know, instantly visible. So for those repository implementations the use of update is not necessary. However, by adhering to the Repository semantics you can migrate between different repository implementation without code changes.

myCustomer.setName("Chris the Customer");
customerRepo.update(anotherCustomer);
...
// some other thread...
Specification<Customer> customersWithKnownNames = allInstancesOfType(Customer.class).where("name", is(not(a(blankString()))));
Collection<Customer> customersWithKnownNamesCollection = customerRepo.findAll(customersWithKnownNames);

Removing entities

customerRepo.remove(myCustomer);

Persistent repositories

A repository does not by default persist its stored entities. The PersistentRepository interface extends the Repository interface and adds persistence capabilities.

public interface PersistentRepository<T extends Entity> extends Repository<T> {
    String                getRepositoryId();
    PersistenceDefinition getPersistenceDefinition();
    void                  load();
    void                  persist();
    void                  close();
}

See the net.sourceforge.domian.repository.PersistentRepository Javadoc for the complete listing.

The persistence definition

All PersistentRepository instances are accompanied by one single mandatory PersistenceDefinition enum object, avalable via the the PersistentRepository.getPersistenceDefinition() method. Table 1 shows all PersistentRepository enums with some of the their defined persistence semantics.

Table 1 - Domian persistence definitions
NamePersistentMode
PersistenceDefinition.INMEMORYnoN/A
PersistenceDefinition.FILEyessynchronous
PersistenceDefinition.DELEGATEDyessynchronous
PersistenceDefinition.INMEMORY_AND_FILEyesasynchronous
PersistenceDefinition.INMEMORY_AND_DELEGATEDyesasynchronous

A synchronous repository communicates directly with the persistence mechanism; put, update, and remove operations are instantly reflected in the persistent medium used. FILE means that the repository implementation uses the file system directly. DELEGATED means that the repository implementation delegates to some external persistence framework/product.

An asynchronous repository on the other hand, communicates to an in-memory object graph when using the Repository methods. Asynchronous repositories must explicitly invoke the load/persist operations for reading/writing the object graph to the persistent medium. Invoking the load/persist operations on synchronous repositories has no effect.

Creating/opening a persistent repository

Persistent repositories must be given a repository ID.

PersistentRepository<Customer> customerRepo = new InMemoryAndXStreamXmlFileRepository<Customer>("customers");

Loading existing entities into a repository

For asynchronous repositories the existing persistent entities must be read into memory once the repository is created.

customerRepo.load();

Persisting entities

For asynchronous repositories the existing persistent entities must be read into memory once the repository is created.

Customer myCustomer = new Customer(1L, today);
Customer anotherCustomer = new Customer(2L, threeDaysAgo);
customerRepo.putAll(asList(myCustomer, anotherCustomer));

customerRepo.persist();

Again, for synchronous repositories, loading and persisting entities is of no consequence. It is in fact recommended doing so because it is part of the PersistentRepository semantics. By doing so you can switch from a synchronous repository to an asynchronous one without any code changes. (If partitioning is used for this repository switch, the data is automatically migrated as well; more on that below.)

Closing a repository

Releases all repository resources, if any.

customerRepo.close();

Partitioned repositories

Partitioned repositories will not consist of a single repository instance, but rather a directed acyclic graph (DAG) of repository instances. Each repository partition is defined by a Specification, being the partition specification.

All entities approved by a partition specification are placed in that partition. The same entity may of that reason reside in several partitions. The exception is the root partition; if an entity belongs to any sub-partition, it is removed from the root partition.

All partitions are organized following the semantics of specification subsumptions; isGeneralizationOf, isSpecialCaseOf, and isDisjointWith.

public interface PartitionRepository<T extends Entity> extends Repository<T> {
    <V extends T> PartitionRepository<V> addPartitionFor(Specification<V> partitionSpecification);
    <V extends T> PartitionRepository<V> addPartitionFor(Specification<V> partitionSpecification, String repositoryId);
    <V extends T> PartitionRepository<V> addPartitionFor(Specification<V> partitionSpecification, Repository<? extends V> partitionRepository);
    <V extends T> PartitionRepository<V> findPartitionFor(Specification<V> specification);
    <V extends T> Boolean                repartition(V entity);
                  void                   repartition();
}

See the net.sourceforge.domian.repository.PartitionRepository Javadoc for the complete listing.

Creating a partitioned repository

PartitionRepository<Entity> repo = new InMemoryRepository<Entity>().makePartition();

Creating partitions

When partitioning a repository, you just hand over the partition specification. A partition will be created using a repository implementation of the same type as the repository from it was added.

repo.addPartitionFor(all(Order.class));

You may also, in addition to the partition specification, hand over a repository intance to be used for that particular partition.

Specification<Customer> activeCustomers = all(Customer.class).where("orders", haveSizeOf(moreThan(100)));
...
repo.addPartitionFor(all(Customer.class), new InMemoryAndXStreamXmlFileRepository<Customer>("customers"));
repo.addPartitionFor(activeCustomers, new InMemoryAndXStreamXmlFileRepository<Customer>("active-customers"));

Usage scenarios

Partitioned repositories may be useful for several reasons.

Improving search time / projection latency

For certain applications it makes sence to improve the read-time on the behalf of write-time. If the type of entities to be frequently read are known in advance, a specification of those entity types may be defined and used to partition your repository. Here is a simple example:

Specification<Pearl> blackPearls = all(Pearl.class).where("colour", is(BLACK));
Specification<Pearl> greyPearls = all(Pearl.class).where("colour", is(GREY));

PartitionRepository<Pearl> pearlRepo = new HashSetRepository<Pearl>().makePartition();
pearlRepo.addPartitionFor(blackPearls);

// insert a lot of pearls...

Collection<Pearl> blackPearlCollection = pearlRepo.findAll(blackPearls);
Collectio<Pearl> greyPearlCollection = pearlRepo.findAll(greyPearls);

Here, the retrieval of black pearls now will take considerably less time than the grey ones. The partitions may be added prior or after the entities are put into the repository. When adding partitions after the entities are added, it will, of course, take some more time to add the partitions than when adding partitions before the entities are added.

Taking advantage of different repository implementations
PartitionRepository<Entity> repo = new InMemoryRepository<Entity>().makePartition();                                    // INMEMORY
repo.addPartitionFor(all(Configuration.class),         new XStreamXmlFilePerEntityRepository<Configuration>("config")); // FILE
repo.addPartitionFor(all(Customer.class),              new InMemoryAndXStreamXmlFileRepository<Customer>("customers")); // INMEMORY_AND_FILE
repo.addPartitionFor(all(Order.class),                 new InMemoryAndXStreamXmlFileRepository<Order>("orders"));       // INMEMORY_AND_FILE
repo.addPartitionFor(all(OrderProcessingTicket.class), new InMemoryRepository<OrderProcessingTicket>());                // INMEMORY

The reason for using different repository implementations is obvious; different partitions containing different types of entities may need different repository attributes.

In the example above we are taking advantage of different repository implementations particularly selected for each partition. The default root partition is an InMemoryRepository, just to be sure nothing is unnecessarily persisted. For configuration stuff, a direct-to-file XStreamXmlFilePerEntityRepository is used. For the main domain classes, a processing-friendly asynchronous InMemoryAndXStreamXmlFileRepository is selected. And, at last, for transient entities like OrderProcessingTicket, an InMemoryRepository also is used. This last partition is not necessary to add as it would have gone into the root partition anyway, but it is shown just to emphasize how transient entities should be treated.

Migrating data

By re-adding a partition with another repository implementation, all residing entities in that partition will be automatically copied to the new partition repository. The example below shows data moved from an asynchronous InMemoryAndXStreamXmlFileRepository repository, to a synchronous XStreamXmlFilePerEntityRepository repository,

PersistentRepository<Entity> repo = new InMemoryAndXStreamXmlFileRepository<Entity>("entities");
Customer myCustomer = new Customer(1L, today);
Customer anotherCustomer = new Customer(2L, threeDaysAgo);
repo.putAll(asList(myCustomer, anotherCustomer));
repo.persist();

// migrate
PersistentRepository<Entity> newRepo = new XStreamXmlFilePerEntityRepository<Entity>("migrated_entities");
PartitionRepository migratedRepo = repo.makePartition().addPartitionFor(all(Entity.class), newRepo);

// clean up old repo
repo.remove(all(Entity.class));
repo.persist();

The replaced (phantom) partition repository must be manually removed as shown in the two last code lines... In future versions, this will not be necessary, as migrating data using partitions will get more dedicated API-support.

Partitioning with specifications makes it possible for fine-grained data migration.

Repository implementations

Table 2 - Repository implementations so far. See below for some content guidance.
NameConcurrentPersistentPartitionableStores aggregate root only
n.s.d.r.HashSetRepositorynonoyesyes
n.s.d.r.InMemoryRepositoryyesnoyesyes
n.s.d.r.XStreamXmlFilePerEntityRepositoryyesFILEyesyes
n.s.d.r.InMemoryAndXStreamXmlFileRepositoryyesINMEMORY_AND_FILEyesyes
n.s.d.r.HibernateRepositoryyesDELEGATEDnot yet implementedno, also recursively stores member entities
n.s.d.r.Db4oRepositoryyesDELEGATEDyesno, also recursively stores member entities

The n.s.d.r is short for net.sourceforge.domian.repository.

In addition to the repository implementations showed in Table 1, there are also several fake repositories.

The Persistent column states what kinds of PersistenceDefinitons the repository implementation supports, if it is persistent that is.

All Domian repositories (based on the domian-core module) are partitionable via the AbstractRepository.makePartition() method.

When you put a particular entity into a repository, it becomes searchable via the find* methods. The entities' member entities on the contrary, may be searchable via the find* methods. It depends on whether the repository implementation stores aggregates roots only, or member entities as well.

The Db4oRepository will be available in Domian v0.5.2.