In several economic, statistical and geographical applications, a territory must be subdivided into functional regions. Such regions are not fixed and politically delimited, but should be identified by analyzing the interactions among all the localities that make up the territory. This is a delicate and important task that often turns out to be computationally difficult. In this work we propose an innovative approach to this problem based on the solution of minimum cut problems over an undirected graph, called here the transitions graph. The proposed procedure guarantees that the obtained regions satisfy all the statistical conditions required for this type of problem. Results on real-world instances show the effectiveness of the proposed approach.
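As a rough illustration of the minimum cut idea, the sketch below splits a tiny undirected graph of localities into two groups so that the total interaction weight crossing the split is as small as possible. It is not the procedure of the paper: the function name `min_cut`, the locality names and the flow values are all hypothetical, and the search is a plain exhaustive enumeration that is only practical for a handful of nodes.

```python
from itertools import combinations

def min_cut(nodes, weights):
    """Exhaustively find a minimum-weight split of an undirected graph.

    nodes   : iterable of node labels (hypothetical localities)
    weights : dict mapping (u, v) pairs to a non-negative interaction weight
    Returns (cut weight, set of nodes on the side containing the first node).
    """
    ordered = sorted(nodes)
    first, rest = ordered[0], ordered[1:]
    best_weight, best_side = float("inf"), None
    # Keep the first node on one side so each bipartition is tried once;
    # r stops before len(rest) so the other side is never empty.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            side = {first, *extra}
            # An edge is cut when exactly one endpoint lies in `side`.
            weight = sum(w for (u, v), w in weights.items()
                         if (u in side) != (v in side))
            if weight < best_weight:
                best_weight, best_side = weight, side
    return best_weight, best_side

# Hypothetical interaction weights between four localities.
flows = {("A", "B"): 10, ("C", "D"): 8, ("B", "C"): 1, ("A", "D"): 1}
cut_weight, region = min_cut("ABCD", flows)
print(cut_weight, sorted(region))
```

On this toy input the weakest links are B-C and A-D, so the split separates {A, B} from {C, D}. Real instances would call for a polynomial-time minimum cut algorithm rather than enumeration.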
In large-scale surveys, such as a Census, data may contain errors or missing values. An automatic error correction procedure is therefore needed. We focus on the problem of restoring the consistency of agricultural data concerning cultivation areas and number of livestock, and we propose an approach to this balancing problem based on optimization. Possible alternative models, either linear, quadratic or mixed integer, are presented. The mixed integer linear model has been preferred and used for the treatment of possibly unbalanced data records. Results on real-world Agricultural Census data show the effectiveness of the proposed approach.
Error localization problems can be converted into Integer Linear Programming problems. This approach provides several advantages and guarantees that a set of erroneous fields having minimum total cost is found. With this approach, each erroneous record produces an Integer Linear Programming model that must be solved. This requires specific software called Integer Linear Programming solvers. Some of these solvers are available as open source software. A study on the performance of internationally recognized open source Integer Linear Programming solvers, compared to a reference commercial solver on real-world data having only numerical fields, is reported. The aim was to produce a stress-test environment for selecting the most appropriate open source solver for performing error localization in numerical data.
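The notion of a minimum-cost set of erroneous fields can be illustrated with a small sketch. It does not build an Integer Linear Programming model or call any solver; it simply enumerates candidate field sets in order of increasing cost and checks whether the freed fields can be given values that satisfy every consistency rule. The function `localize_errors`, the field names, the rules, the costs and the small integer domains are all hypothetical assumptions made for illustration.

```python
from itertools import combinations, product

def localize_errors(record, domains, edits, costs):
    """Find a cheapest set of fields whose change can make the record valid.

    record  : dict field -> observed value
    domains : dict field -> iterable of admissible values (kept small here)
    edits   : list of predicates; each takes a record and returns True if satisfied
    costs   : dict field -> cost of declaring that field erroneous
    Returns (total cost, set of fields), or None if no repair exists.
    """
    fields = list(record)
    candidates = []
    for r in range(len(fields) + 1):
        for subset in combinations(fields, r):
            candidates.append((sum(costs[f] for f in subset), subset))
    candidates.sort(key=lambda item: item[0])  # cheapest sets first
    for cost, subset in candidates:
        # Try every assignment of the freed fields; others keep observed values.
        for values in product(*(domains[f] for f in subset)):
            trial = dict(record)
            trial.update(zip(subset, values))
            if all(edit(trial) for edit in edits):
                return cost, set(subset)
    return None

# Hypothetical record that violates the rule "cows <= cattle".
record = {"cattle": 5, "cows": 8, "area": 10}
domains = {f: range(0, 21) for f in record}
edits = [lambda r: r["cows"] <= r["cattle"], lambda r: r["area"] >= 1]
costs = {"cattle": 1, "cows": 2, "area": 1}
print(localize_errors(record, domains, edits, costs))
```

Here changing `cattle` alone is enough and is cheaper than changing `cows`, so it is reported as the erroneous field. The enumeration grows exponentially with the number of fields, which is why an optimization model solved by a dedicated solver is the practical route.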
In some large statistical surveys, the set of units that will constitute the scope of the survey must be selected. We focus on the real case of a Census of Agriculture, where the units are farms. Surveying each unit has a cost and brings a different portion of the whole information. In this case, one wants to determine a subset of units whose total surveying cost is minimum and that represents at least a certain portion of the total information. Uncertainty aspects also occur, because the portion of information corresponding to each unit is not perfectly known before surveying it. The proposed approach is based on combinatorial optimization, and the arising decision problems are modeled as multidimensional binary knapsack problems. Experimental results show the effectiveness of the proposed approach.
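The selection problem described above can be sketched in a simplified form with a single information constraint and no uncertainty, which is a deliberate reduction of the multidimensional case. The function `min_cost_selection` and the numbers are hypothetical; the sketch uses a standard dynamic program over integer information levels rather than the models of the paper.

```python
def min_cost_selection(costs, info, required):
    """Choose units of minimum total cost whose information sums to >= required.

    costs    : list of surveying costs, one per unit
    info     : list of non-negative integer information amounts, one per unit
    required : integer amount of information that must be covered
    Returns (minimum cost, list of chosen unit indices); the cost is
    float('inf') with an empty list when the requirement cannot be met.
    """
    INF = float("inf")
    # best[j] = cheapest (cost, units) reaching information level j,
    # where levels above `required` are capped at `required`.
    best = [(INF, [])] * (required + 1)
    best[0] = (0, [])
    for i, (c, v) in enumerate(zip(costs, info)):
        updated = list(best)
        for j in range(required + 1):
            cost_j, chosen_j = best[j]
            if cost_j == INF:
                continue
            k = min(required, j + v)
            if cost_j + c < updated[k][0]:
                updated[k] = (cost_j + c, chosen_j + [i])
        best = updated  # each unit is used at most once (binary choice)
    return best[required]

# Hypothetical farms: cost to survey and share of information (in percent).
costs = [4, 3, 2, 5]
info = [40, 30, 20, 10]
print(min_cost_selection(costs, info, required=60))
```

With these toy values, surveying units 0 and 2 covers 60 percent of the information at cost 6, which no other subset improves on. Extending the sketch to several simultaneous information constraints would require one table dimension per constraint or an integer programming formulation.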