March, 2011

Layman's Introduction to Random Forests

Edwin Chen

Suppose you’re very indecisive, so whenever you want to watch a movie, you ask your friend Willow if she thinks you’ll like it. In order to answer, Willow first needs to figure out what movies you like, so you give her a bunch of movies and tell her whether you liked each one or not (i.e., you give her a labeled training set). Then, when you ask her if she thinks you’ll like movie X or not, she plays a 20 questions-like game with IMDB, asking questions like “Is X a romant

Three Amazing Web Data Analyses Techniques For Analysis Ninjas

Occam's Razor

Day in and day out we stare at standard tables and rows and convert them into smaller or scarier tables and rows and through analysis we try and move the really heavy beast called the "organization" into action. It is hard. This blog post has three ideas I've learned from other smart people, ideas that help surprise the "organization" with something non-normal and get it to take action.

Prime Numbers and the Riemann Zeta Function

Edwin Chen

Lots of people know that the Riemann Hypothesis has something to do with prime numbers, but most introductions fail to say what or why. I’ll try to give one angle of explanation. Layman’s Terms. Suppose you have a bunch of friends, each with an instrument that plays at a frequency equal to the imaginary part of a zero of the Riemann zeta function.

Hacker News Analysis

Edwin Chen

I was playing around with the Hacker News database Ronnie Roller made (thanks!), so I thought I’d post some of the things I looked at. Activity on the Site. My first question was how activity on the site has increased over time. I looked at number of posts, points on posts, comments on posts, and number of users. Posts. This looks like a strong linear fit, with an increase of 292 posts every month.

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Speaker: Maher Hanafi, VP of Engineering at Betterworks & Tony Karrer, CTO at Aggregage

Executive leaders and board members are pushing their teams to adopt Generative AI to gain a competitive edge, save money, and otherwise take advantage of the promise of this new era of artificial intelligence. There's no question that it is challenging to figure out where to focus and how to advance when it’s a new field that is evolving every day. 💡 This new webinar featuring Maher Hanafi, VP of Engineering at Betterworks, will explore a practical framework to transform Generative AI pr

Layman's Introduction to Measure Theory

Edwin Chen

Measure theory studies ways of generalizing the notions of length/area/volume. Even in 2 dimensions, it might not be clear how to measure the area of the following fairly tame shape: much less the “area” of even weirder shapes in higher dimensions or different spaces entirely. For example, suppose you want to measure the length of a book (so that you can get a good sense of how long it takes to read).

Netflix Prize Summary: Factorization Meets the Neighborhood

Edwin Chen

(Way back when, I went through all the Netflix prize papers. I’m now (very slowly) trying to clean up my notes and put them online. Eventually, I hope to have a more integrated tutorial, but here’s a rough draft for now.) This is a summary of Koren’s 2008 Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model.

Counting Clusters

Edwin Chen

Given a set of datapoints, we often want to know how many clusters the datapoints form. The gap statistic and the prediction strength are two practical algorithms for choosing the number of clusters. Gap Statistic. The gap statistic algorithm works as follows: for each i from 1 up to some maximum number of clusters, run a k-means algorithm on the original dataset to find i clusters, and sum the distance of all points from their cluster mean.
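
The step described in this excerpt can be sketched in a few lines of Python. This is only an illustration of that one step (run k-means for each candidate cluster count and total the distances to the cluster means), not the full gap statistic, whose remaining steps are not shown here. The function names `kmeans`, `dispersion`, and `dispersion_curve`, the iteration cap, and the fixed seed are all assumptions made for the example, and the k-means is a plain Lloyd's iteration rather than any particular library's implementation.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means on a list of coordinate tuples; returns the centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: initialize from k random points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(centers[c])  # leave an empty cluster in place
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers


def dispersion(points, k, seed=0):
    """Sum of distances from every point to its nearest cluster mean."""
    centers = kmeans(points, k, seed=seed)
    return sum(min(math.dist(p, c) for c in centers) for p in points)


def dispersion_curve(points, max_k, seed=0):
    """Dispersion for each cluster count i from 1 up to max_k."""
    return {k: dispersion(points, k, seed=seed) for k in range(1, max_k + 1)}
```

Calling `dispersion_curve(points, max_k)` on a list of coordinate tuples returns one total per candidate cluster count. On data with well-separated groups, the total typically drops sharply once the count reaches the true number of groups.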

The Market Motive Master Certification Manifesto: Web Analytics

Occam's Razor

Many of you are aware that I am the co-Founder of Market Motive, a delightful little labor of love whose mission in life is to provide bleeding edge education via quarterly, what we call, Master Certification courses. There are seven courses in all: SEO, PPC, Social Media, Web Analytics, Conversion Optimization, Marketing Fundamentals and Online PR.

Topological Combinatorics and the Evasiveness Conjecture

Edwin Chen

The Kahn, Saks, and Sturtevant approach to the Evasiveness Conjecture (see the original paper here) is an epic application of pure mathematics to computer science. I’ll give an overview of the approach here, and probably try to add some more information on the problem in other posts. tl;dr The KSS approach provides an algebraic-topological attack on a combinatorial hypothesis, and reduces a graph complexity problem to a problem of contractibility and (not) finding fixed points.
