Pre-lecture materials
Read ahead
- Take a look at the data in this recent paper: Winchell et al. (2023) Genome-wide parallelism underlies contemporary adaptation in urban lizards. PNAS 120 (3) e2216789120. Check the Slack channel for the PDF.
Watch ahead: A Motivating Example for the whole endeavor of data analysis
In the age of Artificial Intelligence (AI), we have many data-intensive tools available. But can we just throw more data at a problem to get better outcomes? Please watch this thought-provoking short talk by Sebastian Wernicke, “How to use data to make a hit TV show”… What goes wrong when we look for decisions in the wrong places?
Can we design a hit TV show (or anything of importance) with data?
The Wizard of Oz (1939), starring Judy Garland, was one of the first major motion pictures filmed in color, using the complex Technicolor process. It was a tremendous success. I remember my father telling me what a huge event it was when it came to his small town. Yet was its success inevitable? Even at that time there were many other movies, and there were plenty of doubts about whether the American audience would accept the fantasy story, the length, the musical choices, the actors, and so many other variables. Even though the movie landscape was much simpler then, it was still multivariate.
Yet there must be something to the analysis of data on human behavior. Internet companies sell our web browsing history, and there are still opinion polls, demographic surveys, and the like. Do these types of data differ? Can we do better?
- The first study produced a TV show that was perfectly average. How do you imagine they approached the data, and what might have been the difference with the study that led to the hit show?
- How do shows become hits? Is the underlying mechanism complex? Or is the predictive data complex? Or both? What contributes to complexity?
- What is a good role for data analysis in this type of question?
- Can we really “let the data tell us the answer”? What does such a statement leave unstated?
Genome-wide parallelism underlies contemporary adaptation in urban lizards
This is a hot-off-the-presses study into a hot topic.
- What are the questions in this study?
- What are the types of data?
- At the most basic level, what are the questions in the data analysis?
When we break it down, what can we do with data?
- Same vs. different
- Similar vs. less similar
- Moving in the same direction
- Are groupings real?
- Larger vs. smaller
- Predictive order
Can we identify these data comparisons in Winchell et al. (2023)?
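To make the comparison types listed above concrete, here is a minimal sketch in Python. It uses made-up toy numbers, not data from Winchell et al. (2023), and every variable name and helper function is hypothetical. Each comparison could be done in other ways; the exact permutation test in particular is just one simple way to ask whether a grouping is "real".

```python
from itertools import combinations
from statistics import mean

# Made-up toy measurements for illustration only (NOT from Winchell et al. 2023).
urban = [5.1, 5.4, 5.0, 5.6, 5.3]   # hypothetical trait values, urban site
forest = [4.6, 4.8, 4.5, 4.9, 4.7]  # hypothetical trait values, forest site

# Same vs. different, larger vs. smaller: compare the group means.
observed_diff = mean(urban) - mean(forest)
urban_is_larger = observed_diff > 0

# Are groupings real? Exact permutation test: across every way of relabeling
# the ten values into two groups of five, how often is the difference in
# means at least as large as the one we observed?
pooled = urban + forest
hits = total = 0
for idx in combinations(range(len(pooled)), len(urban)):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    if mean(group_a) - mean(group_b) >= observed_diff - 1e-12:
        hits += 1
p_value = hits / total  # one-sided

# Similar vs. less similar: mean absolute distance between trait profiles.
def profile_distance(x, y):
    return mean([abs(a - b) for a, b in zip(x, y)])

pop_a = [1.0, 2.0, 3.0]  # hypothetical profiles: three traits per population
pop_b = [1.1, 2.1, 2.9]
pop_c = [2.0, 3.5, 1.0]
a_is_closer_to_b = profile_distance(pop_a, pop_b) < profile_distance(pop_a, pop_c)

# Moving in the same direction: do urban-minus-forest shifts in two
# hypothetical cities share a sign, trait by trait?
shift_city1 = [0.4, -0.2, 0.3]
shift_city2 = [0.6, -0.1, 0.2]
same_direction = [(a > 0) == (b > 0) for a, b in zip(shift_city1, shift_city2)]

# Predictive order: does ranking the sites by a predictor give the same
# ordering as ranking them by the response?
def rank_order(values):
    return sorted(range(len(values)), key=values.__getitem__)

predictor = [10, 30, 20, 40]      # hypothetical predictor, one value per site
response = [1.2, 2.9, 2.1, 3.8]   # hypothetical response, one value per site
order_matches = rank_order(predictor) == rank_order(response)

print(f"difference in means: {observed_diff:.2f}, urban larger: {urban_is_larger}")
print(f"permutation p-value: {hits}/{total}")
print(f"A closer to B than to C: {a_is_closer_to_b}")
print(f"same direction per trait: {same_direction}")
print(f"rank order matches: {order_matches}")
```

In this toy data the urban and forest values do not overlap at all, so only the original labeling reaches the observed difference. With real data the groups usually overlap, and the answer is rarely this clean.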