diff --git a/datasets/housing/README.md b/datasets/housing/README.md
index 9d0fa4f..1a9e002 100644
--- a/datasets/housing/README.md
+++ b/datasets/housing/README.md
@@ -1,7 +1,7 @@
 # California Housing
 
 ## Source
-This dataset is a modified version of the California Housing dataset available from [http://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html](Luís Torgo's page) (University of Porto). Luís Torgo obtained it from the StatLib repository (which is closed now). The dataset may also be downloaded from StatLib mirrors.
+This dataset is a modified version of the California Housing dataset available from [Luís Torgo's page](http://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html) (University of Porto). Luís Torgo obtained it from the StatLib repository (which is closed now). The dataset may also be downloaded from StatLib mirrors.
 
 This dataset appeared in a 1997 paper titled *Sparse Spatial Autoregressions* by Pace, R. Kelley and Ronald Barry, published in the *Statistics and Probability Letters* journal. They built it using the 1990 California census data. It contains one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people).
 
@@ -60,4 +60,4 @@ Note that the block groups are called "districts" in the Jupyter notebooks, simp
 50% 433.000000 1164.000000 408.000000 3.541400
 75% 644.000000 1718.000000 602.000000 4.745000
 max 6210.000000 35682.000000 5358.000000 15.000100
-
\ No newline at end of file
+