March, 2011

Three Amazing Web Data Analyses Techniques For Analysis Ninjas

Occam's Razor

Day in and day out we stare at standard tables and rows and convert them into smaller or scarier tables and rows, and through analysis we try to move the really heavy beast called the "organization" into action. It is hard. This blog post has three ideas I've learned from other smart people, ideas that help surprise the "organization" with something non-normal and get it to take action.

Prime Numbers and the Riemann Zeta Function

Edwin Chen

Lots of people know that the Riemann Hypothesis has something to do with prime numbers, but most introductions fail to say what or why. I’ll try to give one angle of explanation. Layman’s Terms. Suppose you have a bunch of friends, each with an instrument that plays at a frequency equal to the imaginary part of a zero of the Riemann zeta function.

The Market Motive Master Certification Manifesto: Web Analytics

Occam's Razor

Many of you are aware that I am the co-Founder of Market Motive, a delightful little labor of love whose mission in life is to provide bleeding edge education via quarterly, what we call, Master Certification courses. There are seven courses in all: SEO, PPC, Social Media, Web Analytics, Conversion Optimization, Marketing Fundamentals and Online PR.

Hacker News Analysis

Edwin Chen

I was playing around with the Hacker News database Ronnie Roller made (thanks!), so I thought I’d post some of the things I looked at. Activity on the Site. My first question was how activity on the site has increased over time. I looked at number of posts, points on posts, comments on posts, and number of users. Posts. This looks like a strong linear fit, with an increase of 292 posts every month.
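
As a rough illustration of what a "linear fit" with a monthly increase means, the slope of an ordinary least squares line can be computed directly. This is only a sketch: the function name `fit_line` and the starting value of 1000 are assumptions, and the data is synthetic, built to grow by exactly 292 per month. It is not the Hacker News data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic monthly post counts (hypothetical, NOT the real dataset):
# constructed so that each month adds 292 posts.
months = list(range(12))
posts = [1000 + 292 * m for m in months]

slope, intercept = fit_line(months, posts)
print(f"estimated increase per month: {slope:.1f}")
```

On real monthly counts the points would scatter around the line, and the slope would be the average monthly increase rather than an exact one.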

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Layman's Introduction to Measure Theory

Edwin Chen

Measure theory studies ways of generalizing the notions of length/area/volume. Even in 2 dimensions, it might not be clear how to measure the area of the following fairly tame shape: much less the “area” of even weirder shapes in higher dimensions or different spaces entirely. For example, suppose you want to measure the length of a book (so that you can get a good sense of how long it takes to read).

Netflix Prize Summary: Factorization Meets the Neighborhood

Edwin Chen

(Way back when, I went through all the Netflix prize papers. I’m now (very slowly) trying to clean up my notes and put them online. Eventually, I hope to have a more integrated tutorial, but here’s a rough draft for now.) This is a summary of Koren’s 2008 Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model.

More Trending

Counting Clusters

Edwin Chen

Given a set of datapoints, we often want to know how many clusters the datapoints form. The gap statistic and the prediction strength are two practical algorithms for choosing the number of clusters. Gap Statistic. The gap statistic algorithm works as follows: for each i from 1 up to some maximum number of clusters, run a k-means algorithm on the original dataset to find i clusters, and sum the distance of all points from their cluster mean.
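
The step described above (run k-means for each i and sum the distances of points from their cluster means) can be sketched in plain Python. This covers only that first step, not the full gap statistic. The function names, the simple Lloyd-style k-means, the fixed seed, and the toy points are all assumptions made for illustration.

```python
import math
import random


def kmeans_centers(points, k, iters=100, seed=0):
    """Basic Lloyd-style k-means on a list of tuples; returns the k centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(iters):
        # Assign each point to the index of its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of the points assigned to it.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centers.append(centers[c])  # leave an empty cluster in place
        if new_centers == centers:  # assignments are stable, so stop
            break
        centers = new_centers
    return centers


def dispersion(points, k, seed=0):
    """Sum of distances from each point to its (nearest) cluster mean."""
    centers = kmeans_centers(points, k, seed=seed)
    return sum(min(math.dist(p, c) for c in centers) for p in points)


def dispersion_curve(points, max_k, seed=0):
    """Dispersion for each i from 1 up to max_k clusters."""
    return {i: dispersion(points, i, seed=seed) for i in range(1, max_k + 1)}


# Toy data (hypothetical): two well-separated squares of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
curve = dispersion_curve(points, max_k=3)
for i, total in curve.items():
    print(i, round(total, 3))
```

On this toy data the summed distance drops sharply between one and two clusters, which is the kind of change the method goes on to examine.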

Layman's Introduction to Random Forests

Edwin Chen

Suppose you’re very indecisive, so whenever you want to watch a movie, you ask your friend Willow if she thinks you’ll like it. In order to answer, Willow first needs to figure out what movies you like, so you give her a bunch of movies and tell her whether you liked each one or not (i.e., you give her a labeled training set). Then, when you ask her if she thinks you’ll like movie X or not, she plays a 20 questions-like game with IMDB, asking questions like “Is X a romant

Netflix Prize Summary: Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights

Edwin Chen

(Way back when, I went through all the Netflix prize papers. I’m now (very slowly) trying to clean up my notes and put them online. Eventually, I hope to have a more integrated tutorial, but here’s a rough draft for now.) This is a summary of Bell and Koren’s 2007 Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights paper. tl;dr This paper’s main innovation is deriving neighborhood weights by solving a least squares problem, instead of