This is Hrbrmstr's Typepad Profile.
Recent Activity
Ethics are as important as both thoughtful, honest analyses and great visualizations. It happens to be against the Amazon Terms of Service to scrape IMDB and encouraging that act seems like a bad idea. Just because you can do something doesn't mean you should. (Strike #3?) IMO we need to teach ethics alongside analysis. Ultimately it's still your choice or a reader's choice to risk violation of Terms of Service/Use. LinkedIn sued real people last year for scraping and there are other lawsuits from other sites that can be referred to as well. Better to ask permission and/or err on the side of ethics vs risk the consequences.
For some definition of the word "directly" at least. I doubt any R person would consider this contorting:

    library(ggplot2)

    dat <- data.frame(year=2010:2015,
                      penalties=c(627, 625, 653, 617, 661, 730))

    avg <- data.frame(val=mean(head(dat$penalties, -1)),
                      last=dat$penalties[6],
                      lab="5-Yr\nAvg")

    gg <- ggplot(dat, aes(x=year, y=penalties))
    gg <- gg + geom_point()
    gg <- gg + scale_x_continuous(breaks=c(2010, 2014, 2015))
    gg <- gg + scale_y_continuous(breaks=c(600, 650, 700, 750),
                                  limits=c(599, 751), expand=c(0,0))
    gg <- gg + geom_segment(data=avg, aes(x=2010, xend=2015, y=val, yend=val),
                            linetype="dashed")
    gg <- gg + geom_segment(data=avg, aes(x=2015, xend=2015, y=val, yend=last),
                            color="steelblue")
    gg <- gg + geom_point(data=avg, aes(x=2015, y=val), shape=4)
    gg <- gg + geom_point(data=avg, aes(x=2015, y=700), shape=17, col="steelblue")
    gg <- gg + labs(x=NULL, y="Number of Penalties",
                    title="NFL Penalties Jumped 15% in the\nFirst 3 Weeks of the 2015 Season\n")
    gg <- gg + theme_bw()
    gg <- gg + theme(panel.grid.minor=element_blank())
    gg <- gg + theme(panel.grid.major.x=element_blank())
    gg <- gg + theme(axis.ticks=element_blank())
    gg
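The 15% in the chart title can be checked against the same six numbers. This is a minimal sketch in base R, assuming (as the `avg` data frame does) that the baseline is the mean of the 2010 to 2014 values; the names `penalties`, `baseline` and `pct_jump` are made up for this illustration.

```r
# Penalty counts copied from the plotting code (2010 through 2015)
penalties <- c(627, 625, 653, 617, 661, 730)
names(penalties) <- 2010:2015

# Baseline: mean of every season except the last one (2010-2014)
baseline <- mean(head(penalties, -1))

# Relative change of the 2015 value against that baseline, in percent
pct_jump <- (penalties[["2015"]] / baseline - 1) * 100

round(baseline, 1)  # 636.6
round(pct_jump, 1)  # 14.7, i.e. the "15%" in the title after rounding
```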
You should be able to use Shiny with streamgraphs without issue. There are a couple of issues relating to Shiny, but if you run into any, definitely let me know.
Commented Aug 4, 2015 on Streamgraphs in R at Revolutions
If plotly handled coord_flip() for horizontal bar charts it'd be my go-to display medium for publicly shareable data.
I don't think "junk data" is limited to "big data". I recently performed a data quality analysis of something you would expect to be "good data" at this point: computer/data breaches. Generating any type of real analysis from it requires more caveats than there would be text in the report. Still, there are components of the data set that are good enough for use in real analytics. Like Andy said, we just need to figure out the best method for separating signal from noise.