diff --git a/Density coloured scatterplots with ggplot2/README.md b/Density coloured scatterplots with ggplot2/README.md
index 195d9455e53d9e0e1862129e71e42dae5b4159d7..15bda83774f34af34b0cfa69e30dfb9dca023906 100644
--- a/Density coloured scatterplots with ggplot2/README.md
+++ b/Density coloured scatterplots with ggplot2/README.md
@@ -1,10 +1,12 @@
 # Density coloured scatter plots to avoid overplotting
 
-Any time.
+Looking at correlations between various features in large-scale data sets is best done with scatter plots. However, when the number of values increases, the central region of a scatter plot becomes so crowded that it is no longer possible to tell how many points are present. One way to better visualize the data density is to add color to the points, as in the excellent [ggpointdensity](https://cran.r-project.org/package=ggpointdensity) R package by Lukas PM Kremer, which provides the `geom_pointdensity` function. However, plotting so many dots becomes a problem when drawing figures, as each tiny dot is rendered by the PDF viewer and the final files contain a lot of useless information.
+
+The overplotting problem has been solved by FACS software, where tens to hundreds of thousands of events are displayed in multiple scatter plots.
 
 A solution for R was proposed in one of the answers to [this](https://stackoverflow.com/questions/13094827/how-to-reproduce-smoothscatters-outlier-plotting-in-ggplot/59147836#59147836) question on stackoverflow and I adapted it to my own needs.
 
 Running the included example from the R script leads to this image:
 
-The result will need further adjustments, in Inkscape.
+The result will need further adjustments, usually done in Inkscape.