EGI Document 2665-v4

EGI-Engage D6.5 Final version of Multi-Source Distributed Real-Time Search and Information Retrieval application

The multi-source distributed real-time search and analytics application (SIR) aims to bring a next-generation search and data retrieval platform, fed by different source systems and heterogeneous data, to users from the Arts and Humanities community. To achieve this, a data hub is implemented using big data stream and batch processing techniques. The data hub feeds a distributed search engine that is driven by a research-specific object storage architecture. On top of this stack, an interface for the Arts and Humanities community gives users big-data-enabled search and analytics capabilities.
