FOSS4G 2024 | The Digital Module of the IS_Agro Project
Author: FOSS4G
Uploaded: 2025-09-29
Views: 30
Description:
"The IS_Agro project is an initiative focused on critically evaluating methodologies designed in global forums and adapting them to the national context. It does so by developing new agro-socio-environmental metrics and indicators (IASs) that aim to represent the agricultural landscape of the national territory more accurately and authentically. IASs are measures used to monitor and evaluate agricultural performance in its social, economic and environmental aspects. They are therefore important in guiding more sustainable political strategies and agricultural practices, whether by public or private entities, serving “to evaluate the performance of agriculture in terms of its environmental, social and economic performance, providing comparative data and information between federative entities or countries, among several other applications” (EMBRAPA SOLOS, 2023). In this project, IASs are developed by different teams specialized in the proposed themes, whose work is approved and published in the scientific arena beforehand. To automate data collection, allocation, calculation and constant updating of the IASs, a team called the Digital Module develops a solution for each indicator, transforming it into a digital algorithm. Structured, semi-structured and unstructured registration data are collected and stored in a data lakehouse, which requires a well-organized repository so that the data is always available and easily accessible. The team decided to implement the medallion architecture, which allocates data in three layers with different purposes, and used an open source platform for pipeline management and automation.
The project is conceived as a digital platform linked to the Brazilian Agricultural Observatory. It aims to publish indicators and parameters derived from well-founded technical and scientific data, capable of evaluating the effective performance of the national agricultural sector at the municipal or state level. These contribute to sectoral policies and to planning and management processes aimed at building sustainable agriculture and positioning the country correctly on the international scene. The general objective is therefore to develop an intelligent environment that automates and manages the IAS pipelines, with data storage organized according to the medallion architecture, to serve as the basis of the data panel that publishes the indicators.
A data pipeline is a succession of connected phases that enable the collection, storage, modification, analysis, and representation of data, with the purpose of acquiring meaningful insights and supporting informed choices (CALANCA, 2023). A data lakehouse, the destination of the project pipelines, is “like a modern data platform built from a combination of a data lake and a data warehouse” (ORACLE CLOUD INFRASTRUCTURE, 2023), using “the flexible storage of unstructured data from a data lake and the management capabilities and tools of data warehouses, and then strategically deploying them together as a larger system” (ORACLE CLOUD INFRASTRUCTURE, 2023). The medallion architecture is a sequential structuring of data storage that logically organizes the data in the lakehouse, incrementally and progressively improving the structure and quality of the data as it flows through the three layers of the architecture (ARQUITETURA medallion, 2024). The terms bronze (raw data from the source), silver (transformed and validated data), and gold (refined and enriched data for use in projects) describe the quality of the data at each stage of the process (SKAYA et al., 2024). Pipeline management is performed by Apache Airflow (version 2.44), an open-source, Python-based platform for developing, scheduling, and monitoring batch-oriented workflows, which allows workflows to be connected to virtually any technology (WHAT is Airflow™?, 2023). The Airflow execution environment was structured in Docker, an open-source platform for creating and managing containers, which work like modular, lightweight virtual machines that contain only the essentials for their execution. The developed image is available on GitHub.
Carlos Eduardo Mota
Use cases & applications"
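As an illustration of the bronze, silver and gold layers described in the abstract, here is a minimal sketch in plain Python. It is not the project's code: the records, the field names (`municipality`, `year`, `yield_t_ha`) and the functions `to_silver` and `to_gold` are all hypothetical, chosen only to show how data quality could improve from one layer to the next. In the project itself, steps like these would presumably run as tasks in Airflow pipelines writing to the lakehouse.

```python
from statistics import mean

# Bronze layer: raw records kept exactly as collected from the source.
# These rows and field names are invented for illustration.
BRONZE = [
    {"municipality": "A", "year": "2020", "yield_t_ha": "3.2"},
    {"municipality": "A", "year": "2021", "yield_t_ha": "3.6"},
    {"municipality": "B", "year": "2020", "yield_t_ha": ""},  # missing value
    {"municipality": "B", "year": "2021", "yield_t_ha": "2.5"},
]


def to_silver(bronze_rows):
    """Silver layer: validate and type-cast rows, dropping invalid ones."""
    silver = []
    for row in bronze_rows:
        try:
            silver.append(
                {
                    "municipality": row["municipality"],
                    "year": int(row["year"]),
                    "yield_t_ha": float(row["yield_t_ha"]),
                }
            )
        except (KeyError, ValueError):
            # Skip rows with missing fields or unparseable values.
            continue
    return silver


def to_gold(silver_rows):
    """Gold layer: aggregate validated rows into one value per municipality."""
    by_municipality = {}
    for row in silver_rows:
        by_municipality.setdefault(row["municipality"], []).append(row["yield_t_ha"])
    # A hypothetical indicator: mean yield per municipality, rounded to 2 decimals.
    return {name: round(mean(values), 2) for name, values in by_municipality.items()}


if __name__ == "__main__":
    silver = to_silver(BRONZE)
    gold = to_gold(silver)
    print(f"{len(BRONZE)} bronze rows -> {len(silver)} silver rows")
    print(gold)
```

Keeping the bronze rows untouched means the silver and gold layers can be recomputed at any time if a validation rule or indicator definition changes.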