
Case study

Data Warehouse and CEIDG Company Search Engine


Development, modernization and maintenance of a system supporting company search and reporting

For the Ministry of Development, Labour and Technology, we develop, modify and maintain the Data Warehouse. This work includes building the Company Search Engine used by the services available on the biznes.gov.pl portal.

The project covers the development of a system that processes and provides access to entrepreneur data, supporting both information search and the generation of statistical and analytical reports. The solution plays an important role in the public e-services ecosystem because it has taken over a significant share of the search and reporting queries that were previously sent to CEIDG (the Central Register and Information on Economic Activity).

The project is being carried out in 2021–2026 and includes maintenance of the existing solution, further development, technological modernization and expansion with new modules.

Challenge

The client needed a flexible search engine that could power the interface available on biznes.gov.pl. The engine had to enable fast and precise searching of resources, while meeting detailed guidelines regarding search behavior and data presentation.

Another important challenge was the growing popularity of the solution. As the number of users increased, system load continued to grow, which required ongoing maintenance support, performance optimization and architectural development.

The project was also important because it helped offload the CEIDG system. The Data Warehouse and Company Search Engine were intended to take over search and reporting queries that had previously been sent directly to CEIDG. This made it possible to separate transactional, search and reporting workloads.

Key technical challenges

One of the largest tasks was rebuilding the Data Warehouse from a transaction-oriented structure into a task-oriented model better suited to reporting and search needs.

Before the modernization, the reporting layer was based on a distributed information structure that had not been designed around specific reporting tasks. The modernization involved reorganizing data storage so that data would be prepared for specific reports and usage scenarios, without redundant information burdening the reporting process.

As a result, each report received a dedicated data source, properly indexed and adjusted to user needs. This approach improved performance, made system behavior more predictable and better prepared the solution for handling a growing number of queries.
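The idea of a dedicated, indexed data source per report can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely for illustration; the production system is described as using Microsoft SQL Server. The table names, column names and sample rows are hypothetical and are not taken from the actual Data Warehouse.

```python
import sqlite3


def build_report_source(conn):
    """Derive a report-specific, indexed table from a transaction-style table."""
    cur = conn.cursor()
    # Hypothetical transaction-oriented table: one row per register event.
    cur.execute(
        "CREATE TABLE business_events ("
        "business_id INTEGER, voivodeship TEXT, status TEXT, event_date TEXT)"
    )
    cur.executemany(
        "INSERT INTO business_events VALUES (?, ?, ?, ?)",
        [
            (1, "podlaskie", "active", "2023-01-10"),
            (1, "podlaskie", "suspended", "2023-06-01"),
            (2, "podlaskie", "active", "2023-02-15"),
            (3, "wielkopolskie", "active", "2023-03-20"),
        ],
    )
    # Task-oriented table: only the latest status of each business,
    # already aggregated the way the report needs it.
    cur.execute(
        """
        CREATE TABLE report_status_by_voivodeship AS
        SELECT e.voivodeship, e.status, COUNT(*) AS businesses
        FROM business_events e
        JOIN (
            SELECT business_id, MAX(event_date) AS last_date
            FROM business_events
            GROUP BY business_id
        ) latest
          ON latest.business_id = e.business_id
         AND latest.last_date = e.event_date
        GROUP BY e.voivodeship, e.status
        """
    )
    # Index matching the report's only filter.
    cur.execute(
        "CREATE INDEX idx_report_voivodeship "
        "ON report_status_by_voivodeship (voivodeship)"
    )
    conn.commit()


def read_report(conn, voivodeship):
    """Read the report for one voivodeship without touching the event table."""
    rows = conn.execute(
        "SELECT status, businesses FROM report_status_by_voivodeship "
        "WHERE voivodeship = ? ORDER BY status",
        (voivodeship,),
    ).fetchall()
    return dict(rows)
```

In this sketch the report query never scans the event history. It reads a small table that holds exactly the columns the report presents, which is the property the modernization aimed for.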

Another major challenge was rewriting all APIs exposed by the Data Warehouse, migrating them from Java to .NET. This technology change opened up further development options for the entire solution and made it easier to adapt the system to current technological and legal requirements.

Project limitations

An important limitation was dependency on components used in the original solution. Some of these components changed their licensing models, while others had licenses that limited further system development.

For this reason, the project required not only functional maintenance and expansion, but also gradual reduction of dependency on elements that limited further modernization. This was particularly important for a public-sector solution that must be developed over the long term and adapted to changing user needs, legal requirements and technologies.

Scope of work

As part of the project, we are responsible for maintaining, developing and extending the system with new modules.

The scope of work includes, among other things:

  • development and maintenance of the Data Warehouse,
  • construction and development of the Company Search Engine,
  • API modernization,
  • redesign of data processing workflows,
  • development of reporting mechanisms,
  • performance optimization,
  • adaptation to legal changes,
  • expansion of integrations,
  • environment maintenance and operational support.

Our solution

As part of the project, we are developing a system consisting of modules responsible for search, data access, reporting, and data processing and preparation workflows.

Company Search Engine Module

The Company Search Engine Module powers the entrepreneur search function available on biznes.gov.pl. Its task is to provide fast and flexible access to searchable data in a way that meets the requirements of the user interface and the expectations of end users.

The engine was designed to handle a growing volume of queries and to relieve CEIDG from search requests that had previously been directed to the source register.
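As an illustration of how a search request from the portal might be translated for an Elasticsearch-based engine, the sketch below builds a query body using the standard Elasticsearch Query DSL (a bool query with a multi_match clause and a term filter). The field names company_name, owner_name and voivodeship, the boost value and the page size are assumptions made for this example and do not describe the actual index.

```python
def build_company_query(text, voivodeship=None, page=0, page_size=20):
    """Build an Elasticsearch Query DSL body for a company search.

    All field names are hypothetical; only the DSL structure is standard.
    """
    query = {
        "bool": {
            "must": [
                {
                    "multi_match": {
                        "query": text,
                        # Assumed fields; "^2" boosts matches on the name.
                        "fields": ["company_name^2", "owner_name"],
                        # Tolerate small typos in the user's input.
                        "fuzziness": "AUTO",
                    }
                }
            ],
            # Filters narrow the result set without affecting scoring.
            "filter": [],
        }
    }
    if voivodeship:
        query["bool"]["filter"].append({"term": {"voivodeship": voivodeship}})
    # "from"/"size" implement simple offset-based paging.
    return {"query": query, "from": page * page_size, "size": page_size}
```

Keeping exact-match constraints in the filter clause and free-text input in the must clause is a common way to combine precise filtering with relevance ranking.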

API V3 Module

The API V3 Module enables automated browsing of the entrepreneur database through a REST API. It has been brought in line with current technology and applicable legal requirements.

The API modernization made it possible to standardize how data is exposed, widen the options for further development and prepare the system for further integration with other public services and external data consumers.
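Automated browsing through a REST API typically means requesting results page by page. The sketch below shows one way a data consumer could do this. The base URL, the page and limit parameter names and the response shape (a list of items per page) are hypothetical and are not taken from the API V3 documentation. The HTTP call is passed in as a function so that the sketch stays independent of any particular client library.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.invalid/v3/companies"


def build_page_url(filters, page, limit=25):
    """Build the URL of one result page; parameter names are assumed."""
    params = dict(sorted(filters.items()))
    params.update({"page": page, "limit": limit})
    return f"{BASE_URL}?{urlencode(params)}"


def iterate_companies(fetch_page, filters):
    """Yield companies page by page until an empty page is returned.

    fetch_page is any callable that takes a URL and returns a list of
    items, for example a thin wrapper around an HTTP client.
    """
    page = 1
    while True:
        items = fetch_page(build_page_url(filters, page))
        if not items:
            return
        yield from items
        page += 1
```

A caller would supply its own fetch_page function, for instance one that sends an authenticated GET request and decodes the JSON response.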

ETL Process

The ETL process is responsible for processing data collected by the Data Warehouse. It prepares data by cleaning it, removing corrupted information and standardizing address data.

The purpose of this process is to improve data quality and maximize the accuracy of statistics and reports. As a result, data used in reports and in the search engine is better aligned with user needs and analytical scenarios.
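The cleaning steps named above (removing corrupted records and standardizing address data) can be sketched as follows. This is a simplified illustration in Python, not the project's SQL Server Integration Services implementation. The record fields and the specific rules (a unified "ul." street prefix, the NN-NNN postal code format, capitalized city names) are assumptions chosen for the example.

```python
import re

# Polish postal codes have the form NN-NNN; accept a missing hyphen.
POSTAL_CODE = re.compile(r"^(\d{2})-?(\d{3})$")


def standardize_address(record):
    """Return a cleaned copy of the record, or None if it is unusable."""
    # Collapse repeated whitespace and unify the street prefix.
    street = " ".join((record.get("street") or "").split())
    street = re.sub(r"^(ulica|ul)\.?\s+", "ul. ", street, flags=re.IGNORECASE)
    city = " ".join((record.get("city") or "").split()).title()
    match = POSTAL_CODE.match((record.get("postal_code") or "").strip())
    # Treat records with missing or malformed address parts as corrupted.
    if not street or not city or not match:
        return None
    return {
        **record,
        "street": street,
        "city": city,
        "postal_code": f"{match.group(1)}-{match.group(2)}",
    }


def clean_records(records):
    """Standardize all records and drop the corrupted ones."""
    cleaned = (standardize_address(r) for r in records)
    return [r for r in cleaned if r is not None]
```

Normalizing addresses before loading matters for the reports described here, because grouping by location only works when the same place is always written the same way.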

Reporting

The Data Warehouse provides 41 reports, divided into three main groups.

The first group covers applications submitted to CEIDG. It contains 18 reports:

  • 16 reports broken down by voivodeship,
  • 1 report for companies without a specified place of business activity,
  • 1 report presenting the volume of submitted applications by type.

The second group covers businesses registered in CEIDG, broken down by operating status. It also contains 18 reports:

  • 16 reports broken down by voivodeship,
  • 1 report for companies without a specified place of business activity,
  • 1 report presenting the number of businesses by status: active, suspended or removed from the register.

The third group consists of statistical reports on business activity and applications. It contains 5 reports:

  • 2 reports presenting the age structure of entrepreneurs,
  • 1 report on business activity status,
  • 2 reports presenting processed applications on an annual and daily basis.
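The breakdown above can be checked with a short calculation. The sketch below records the report counts exactly as listed and sums them per group and in total. The dictionary keys are descriptive labels introduced here, not report names used in the system.

```python
# Report counts as listed in this section; the labels are illustrative.
REPORT_GROUPS = {
    "applications submitted to CEIDG": {
        "by voivodeship": 16,
        "no specified place of business activity": 1,
        "application volume by type": 1,
    },
    "registered businesses by status": {
        "by voivodeship": 16,
        "no specified place of business activity": 1,
        "businesses by status": 1,
    },
    "statistical reports": {
        "age structure of entrepreneurs": 2,
        "business activity status": 1,
        "processed applications (annual and daily)": 2,
    },
}


def group_totals(groups):
    """Sum the report counts within each group."""
    return {name: sum(parts.values()) for name, parts in groups.items()}


totals = group_totals(REPORT_GROUPS)
grand_total = sum(totals.values())
```

The per-group totals are 18, 18 and 5, which add up to the 41 reports stated at the start of the section.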

Integrations

The system works with solutions and registers used in the public e-services ecosystem, including:

  • biznes.gov.pl,
  • KRS — National Court Register.

These integrations make it possible to use entrepreneur data in public services and handle search and reporting queries in a more efficient and structured way.

Technologies

The project uses, among other technologies:

  • Rocky Linux,
  • Windows Server,
  • Microsoft SQL Server,
  • MongoDB,
  • .NET Core,
  • C#,
  • Elasticsearch,
  • GitLab,
  • Kibana,
  • Logstash,
  • SQL Server Integration Services,
  • Apache,
  • IIS,
  • Nginx,
  • Gravitee,
  • Wyn Enterprise.

Such a broad technology stack reflects the nature of the project, which combines data processing, search, reporting, API integrations, environment maintenance and system monitoring.

Results

The development of the Data Warehouse and the construction of the Company Search Engine delivered several key results.

The most important outcomes include:

  • increased solution performance,
  • relieving CEIDG of search and reporting queries,
  • delivery of new features,
  • adaptation of the system to legal changes,
  • expansion of the integration scope,
  • migration of the APIs to a .NET-based architecture,
  • organization of the data model around specific reporting tasks,
  • preparation of dedicated data sources for reports,
  • improved data quality through ETL processes,
  • better scalability and easier further development of the system.

Project organization

The project is being carried out using an agile methodology, which allows the team to respond continuously to changing client needs, deliver new features and adapt the system to legal and technological changes.

The project team includes:

  • contractor’s team leader,
  • IT systems architect/designer,
  • system analyst,
  • deployment specialist,
  • tester,
  • database administrator,
  • search engine specialist,
  • developer.

The combination of analytical, architectural, development, database, deployment and maintenance competencies makes it possible to develop a large-scale system of significant importance for public administration.

Summary

The development of the Data Warehouse and the construction of the Company Search Engine support public digital services related to access to entrepreneur information.

The solution made it possible to offload CEIDG by taking over search and reporting queries, while creating a more flexible, efficient and task-oriented data processing architecture. The redesign of the Data Warehouse, development of API V3 and ETL processes, and the use of an Elasticsearch-based search engine created a foundation for further development of services available on biznes.gov.pl.

The project demonstrates the importance of modern data architecture in public administration — especially where the system must handle growing traffic, changing legal requirements and the needs of users relying on online services.


Centrum Informatyki "ZETO" S.A.

ul. Skorupska 9
15-048 Białystok
+48 85 74 83 303

Poznań Branch

ul. Unii Lubelskiej 1 bud. 2 lok. 16
61-249 Poznań
NIP: 542-020-28-07
KRS: 0000012499