Four short links: 13 February 2012

  1. Rise of the Independents (Bryce Roberts) -- companies that don't take VC money and instead choose to grow organically: indies. +1 for having a word for this.
  2. The Performance Golden Rule (Steve Souders) -- 80-90% of the end-user response time is spent on the frontend. Check out his graphs showing where load times come from for various popular sites. The backend responds quickly, but loading all the JavaScript and images and CSS and embedded autoplaying videos and all that kerfuffle takes much, much longer.
  3. Starry Night Comes to Life -- wow, beautiful, must-see.
  4. MapReduce Patterns, Algorithms, and Use Cases -- In this article I digest a number of MapReduce patterns and algorithms to give a systematic view of the different techniques that can be found on the web or in scientific articles. Several practical case studies are also provided. All descriptions and code snippets use the standard Hadoop MapReduce model with Mappers, Reducers, Combiners, Partitioners, and sorting.
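
To make the Mapper, Combiner, and Reducer roles in the last link concrete, here is a minimal in-memory sketch of a word count in plain Python. It is not Hadoop code and is not taken from the linked article; the function names (`mapper`, `combiner`, `reducer`, `run_job`) are illustrative assumptions, and the "shuffle" is simulated with a sort.

```python
from collections import defaultdict
from itertools import groupby


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Pre-aggregate one mapper's output locally, before the shuffle."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def reducer(key, values):
    """Sum all partial counts that share a key."""
    return key, sum(values)


def run_job(lines):
    # Map and combine each "split" (here, each line) independently.
    shuffled = []
    for line in lines:
        shuffled.extend(combiner(mapper(line)))
    # Simulated shuffle/sort: bring equal keys next to each other.
    shuffled.sort(key=lambda kv: kv[0])
    return dict(
        reducer(key, (value for _, value in group))
        for key, group in groupby(shuffled, key=lambda kv: kv[0])
    )


counts = run_job(["to be or not to be", "to do"])
```

In a real cluster the combiner runs on the mapper's node, so only the partial sums cross the network; here it simply shrinks the list before the sort.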

from O'Reilly Radar - Insight, analysis, and research about emerging technologies. http://radar.oreilly.com/2012/02/four-short-links-13-february-2.html?utm_sour...

How to create a visualization

Over the last few years I've created a few popular visualizations and a lot of duds, and I've learned a few lessons along the way. For my latest analysis of where Facebook users go on vacation, I decided to document the steps I follow to build my visualizations. It's a very rough guide; these are just stages I've learned to follow by trial and error, but following these guidelines is a good way to start if you're looking to create your first visualization.

Play with your data

I was lucky enough to spend a few hours recently with Andreas Weigend, head of the Stanford Social Data Lab. He has nine rules of data, and the first is "Start with the problem, not the data." What struck me about visualizations is that I actually take the opposite approach. I find the only way to begin is to explore what information is available and get a feeling for what stories it can tell.

In my case, we have a Cassandra cluster with information on more than 350 million photos shared on Facebook. I've been running Pig analytics jobs regularly to get a view of what we have in there. One of the reports we generate is a count of how many photos and users we have for particular places:

Data source example
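
As a rough illustration of the kind of per-place report described above, here is a minimal Python sketch that counts photos and distinct users for each place. The original job was written in Pig against Cassandra; the record layout, the sample rows, and the `summarize_places` name are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical records: (photo_id, user_id, place_name).
photos = [
    ("p1", "alice", "Paris"),
    ("p2", "alice", "Paris"),
    ("p3", "bob", "Paris"),
    ("p4", "bob", "Las Vegas"),
]


def summarize_places(records):
    """Return {place: (photo_count, distinct_user_count)}."""
    photo_counts = defaultdict(int)
    users = defaultdict(set)
    for _photo_id, user_id, place in records:
        photo_counts[place] += 1
        users[place].add(user_id)  # a set keeps each user once per place
    return {
        place: (photo_counts[place], len(users[place]))
        for place in photo_counts
    }


report = summarize_places(photos)
```

Counting distinct users separately from photos matters because a single prolific photographer can otherwise make a place look more popular than it is.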

I was chatting with my colleague Chris Raynor about this, and he asked me if we could tell where all the visitors to those places were coming from. This was something that had been at the back of my mind for a long time. Seeing how much information we had on each destination made me realize we had enough data to produce significant and meaningful answers.

When I was learning engineering, one of my favorite case studies was an investigation into an air-traffic control system. Software engineers couldn't understand why fully-computerized control rooms were actually less efficient and safe than more old-fashioned sites. What the researchers discovered was that the old process of passing around and arranging small cards that each represented a plane gave controllers a much stronger awareness of the situation than a screen that didn't require their involvement for tasks, such as handing an aircraft to a colleague. I think the same is true of data. The more time you spend manipulating and examining the raw information, the more you understand it at a deep level. Knowing your data is the essential starting point for any visualization.

Pick a question

Now that I had a rough idea for what I wanted to visualize, I really needed to focus on what I would be doing. The best way to do that is to choose the exact title you want to give your visualization. I actually messed this up on one early map I created, giving the blog post the title "How to split up the US." Everyone subsequently described it as "The Five Nations of Facebook." Since then, I've tried very hard to pick the most natural title for what I'm going to be presenting, and then ensure I can deliver on the promise of the headline.

In this case I had a clear idea of the question at the start: it was going to be "Where do people go on vacation?". However, as I thought about it, I realized it needed to be a lot more specific and concrete. There are already a lot of "top travel destinations" lists out there, so what made mine different? It was the use of Facebook to gather much richer and more detailed information, so I refined it to "Where do Facebook users go on vacation?".

Sketch out your presentation

I now had the data and a question I wanted to answer. The next step was figuring out how to show the information in a visual form. I'm in love with network diagrams showing connections between thousands of objects, but so often they are completely baffling to the rest of the world. I still remember David Cohen threatening to strangle me if I showed him another one of "those damn spider webs" instead of a business plan. However, network diagrams are a good way of hinting at how much data is available for querying; they can really give an idea of the sheer scale of what's there.

One of my favorite recent visualizations was Paul Butler's map of friendships on Facebook, so I decided to use that as a visual reference:

Paul Butler's Visualizing Friendships visualization
See the full version of Paul Butler's "Visualizing Friendships" visualization.

I borrowed a couple of key ideas from his work: the general color palette of the blue lines on a dark background and the use of great circles to create flowing arcs for all connections.
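
For readers who want to try the great-circle idea, here is a minimal sketch of how such an arc can be sampled. It uses spherical linear interpolation between two latitude/longitude points on a unit sphere; this is one common technique, not necessarily the one used in either visualization, and the function names are assumptions for illustration.

```python
import math


def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a 3D unit vector."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def to_lat_lon(vector):
    """Convert a 3D vector back to (latitude, longitude) in degrees."""
    x, y, z = vector
    return (
        math.degrees(math.atan2(z, math.hypot(x, y))),
        math.degrees(math.atan2(y, x)),
    )


def great_circle_points(start, end, steps=32):
    """Sample steps + 1 points along the great-circle arc from start to end."""
    a = to_unit_vector(*start)
    b = to_unit_vector(*end)
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    omega = math.acos(dot)  # angle between the two points
    sin_omega = math.sin(omega)
    if sin_omega < 1e-12:
        # Identical or antipodal points have no unique arc; return the ends.
        return [start, end]
    points = []
    for i in range(steps + 1):
        t = i / steps
        weight_a = math.sin((1 - t) * omega) / sin_omega
        weight_b = math.sin(t * omega) / sin_omega
        points.append(
            to_lat_lon(tuple(weight_a * p + weight_b * q for p, q in zip(a, b)))
        )
    return points


# Approximate coordinates for Los Angeles and Paris.
arc = great_circle_points((34.05, -118.24), (48.86, 2.35))
```

Each sampled point can then be projected to screen coordinates and joined with line segments; more steps give a smoother curve.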

As I thought about the presentation, I realized that I had to simplify what it would be showing. With sources and destinations plotted all over the world, both the visual look and the querying interface would be overwhelming. Our user-base is primarily American thanks to our reliance on English-only natural language processing, so with that in mind I decided to make life simpler by only showing data from people who lived in the U.S. Accordingly, I changed the question in my title to "Where do American Facebook users go on vacation?".

While I'm mostly presenting this as a linear, waterfall process, what I've just described is a good example of how iterative cycles drive the real workflow. It's hard to know how well a lot of things will work until you try them. As long as you're still making some progress, don't worry if you find yourself going in circles.

Crunch the data

If you know your data, and you have a good idea of the question you're trying to answer, this should be the simplest stage. You'll hopefully have a clear set of requirements and it's just a matter of executing the right queries over your data.

In this case I already had some Pig scripts asking similar questions, so I was able to adapt one of those. The biggest surprise was when I ran into issues with some of the joins. The hard part was running the Hadoop job to gather the raw data from our Cassandra cluster, and that worked. I was able to output smaller files containing the gathered data, and then run a local Pig job to do the joins I needed.
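
The join step can be pictured with a small Python sketch: match each visit to the visitor's home town, then count distinct visitors for every (home, destination) pair. The actual work was done in Pig; the table shapes, sample data, and the `count_flows` name below are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical inputs: user -> home town, and one row per photo location.
homes = {"alice": "West Hollywood, CA", "bob": "Boulder, CO"}
visits = [
    ("alice", "Paris"),
    ("alice", "Paris"),  # a second photo from the same trip
    ("bob", "Paris"),
    ("carol", "Rome"),   # no known home town, so dropped by the join
]


def count_flows(homes, visits):
    """Inner-join visits to homes, counting each user once per destination."""
    seen = set()
    flows = Counter()
    for user_id, destination in visits:
        home = homes.get(user_id)
        if home is None:
            continue  # inner join: skip users without a home record
        if (user_id, destination) in seen:
            continue  # count visitors, not photos
        seen.add((user_id, destination))
        flows[(home, destination)] += 1
    return flows


flows = count_flows(homes, visits)
```

Holding the smaller table in a dictionary works when it fits in memory, which is what makes a local job practical once the raw data has been reduced to smaller files.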

The next stage was turning the raw information into a form that could be displayed. For example, I needed to take all of the user locations from the unstructured text strings that Facebook gave me, and convert them into latitude-longitude coordinates for plotting on a map. For this sort of work I usually turn to a general-purpose scripting language, and most of Jetpac is already written in Ruby, so that was an easy choice. I wrote a script that walked through the data, using the Data Science Toolkit to match coordinates with names, and then output it into a file containing a JSON array of all the information.
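
Here is a minimal sketch of that conversion step, written in Python rather than the Ruby used for the original script. The `GAZETTEER` lookup table stands in for a geocoding service such as the Data Science Toolkit, whose API is not shown here; the table, its approximate coordinates, and the field names are all assumptions for illustration.

```python
import json

# Hypothetical lookup table standing in for a geocoding service.
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "boulder, co": (40.015, -105.2705),
    "west hollywood, ca": (34.09, -118.3617),
}


def normalize(name):
    """Lower-case and collapse whitespace so free-text names match the table."""
    return " ".join(name.lower().split())


def geocode_rows(rows):
    """Attach lat/lon to each row, skipping names that cannot be matched."""
    located = []
    for row in rows:
        coords = GAZETTEER.get(normalize(row["home"]))
        if coords is None:
            continue  # unmatched free-text locations are left out
        lat, lon = coords
        located.append(
            {"home": row["home"], "lat": lat, "lon": lon, "visitors": row["visitors"]}
        )
    return located


rows = [
    {"home": "Boulder,  CO", "visitors": 12},
    {"home": "West Hollywood, CA", "visitors": 40},
    {"home": "somewhere over the rainbow", "visitors": 1},
]
payload = json.dumps(geocode_rows(rows), indent=2)  # JSON array for the renderer
```

Logging the names that fail to match, rather than silently dropping them as this sketch does, is a useful way to spot gaps in the lookup.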

Build an interface

A lot of the best visualizations have no interactivity. They just tell a story with a static image. That's why it's worth considering whether you need an interface at all. I actually had the interactive site that I used to create the "Five Nations of Facebook" visualization up for several weeks before that post, and nobody used it because it was too confusing. It was only when I boiled it down into a single picture with labels that it became a hit.

My problem is that I want other people to have as much fun exploring the data as I've had, so I couldn't resist adding some interaction to the vacation visualization. I still wanted to retain the immediate visual appeal of a static image, so I decided to create a background showing the full data to introduce the visualization at a first glance, and then overlay an interactive foreground once the user started exploring it more deeply.

In most cases you're better off using one of the excellent off-the-shelf visualization frameworks like D3. Since I needed something client-side for interaction, and was working with both geographic and network rendering, I couldn't find anything that met my requirements. Instead I cannibalized one of my own projects, the jQuery component from OpenHeatMap, and combined it with HTML5 canvas rendering to produce a custom JavaScript renderer. I used it to pre-render a background containing all the possible connections between home towns and travel destinations, and saved that off as a static image. That's useful to save rendering time on page load, and lets me fall back to a static visualization on older browsers that don't support Canvas.

Background image of Facebook vacation visualization

I then tied in rendering the connections of any places that the user was hovering their cursor over, so that they could quickly get a feel for the relationships expressed in the data. I also wanted to display the details underlying the picture, so to drill down I added a dialog listing the raw statistics about a place. Users can bring this dialog up by clicking.

Facebook vacation visualization dialog box

One problem with that interaction is that a lot of different cities are in a very small area, so it becomes extremely difficult to pick the one you want with the mouse cursor. To make that a little better, I prioritized the most popular U.S. cities so that in case of a conflict, they're chosen over their smaller neighbors. I realized I also needed to add a search box. Thankfully we're heavy users of Twitter's Bootstrap framework, so it was a simple matter to add a search field and tie it in with Twitter's excellent autocomplete component.
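
The popularity tie-break can be sketched as a small hit-testing function: collect every city within a few pixels of the cursor, then prefer the most popular one. This is a Python illustration of the idea, not the JavaScript used on the site; the `pick_city` name, the pixel radius, and the sample cities are assumptions.

```python
import math


def pick_city(cursor, cities, radius=10.0):
    """Return the most popular city within radius pixels of the cursor.

    Ties in popularity go to the city closest to the cursor.
    Returns None when no city is within range.
    """
    cursor_x, cursor_y = cursor

    def distance(city):
        return math.hypot(city["x"] - cursor_x, city["y"] - cursor_y)

    candidates = [city for city in cities if distance(city) <= radius]
    if not candidates:
        return None
    return max(candidates, key=lambda city: (city["popularity"], -distance(city)))


# Hypothetical screen positions and popularity scores.
cities = [
    {"name": "Los Angeles", "x": 100, "y": 100, "popularity": 900},
    {"name": "West Hollywood", "x": 103, "y": 101, "popularity": 50},
    {"name": "Denver", "x": 300, "y": 200, "popularity": 400},
]
hovered = pick_city((103, 101), cities)
```

With these sample values, hovering directly over the smaller city still selects its larger neighbor, which is why a search box is needed as a second way in.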

Find the surprises!

I build these visualizations so I can explore them myself, so my favorite part of the whole process is the chance to sit and play with the results. There are always unexpected stories hidden in there, and I love uncovering them. For example, who knew that the city that had the most visitors to Paris was West Hollywood? When I lived in Los Angeles I used to love popping by the wonderful patisseries. Now I know why they're so good! These little details are the stories that catch people's imagination and cause them to spread the word, so think about writing a few of them up to help visitors understand what the page can tell them.

You'll never know whether one of your visualizations will become popular ahead of time, but the real reward is enjoying your own work. I hope this short guide gives you some ideas for visualizations you want to build. I look forward to seeing what you come up with.

See the full Facebook vacation visualization.

from O'Reilly Radar - Insight, analysis, and research about emerging technologies. http://radar.oreilly.com/2012/02/how-to-create-visualization-facebook-vacatio...

The Social Media Salary Guide [INFOGRAPHIC]

Social Media Week is upon us, so we thought it would be appropriate to delve into the social media industry and see how its salaries stack up. Social media is an evolving and cutting-edge field, so it should come as no surprise that you can make a great living managing a brand’s presence on Twitter, Facebook, YouTube, Tumblr, LinkedIn, Google+, Pinterest, Instagram, Foursquare and other social platforms.

In the infographic below, produced by OnwardSearch, you can see where the social media jobs are concentrated, the breakdown of job titles in the industry, and how much dough the average social media professional is bringing home each year. (The graphic shows the 25th and 75th percentiles for salary, pulled from Indeed.)

Does this stack up with what you’ve seen in the industry? Do you think these positions and the salaries make sense, given the rise of social media? Let us know in the comments.


Infographic courtesy of OnwardSearch


Social Media Job Listings


Every week we post a list of social media and web job opportunities. While we publish a huge range of job listings, we’ve selected some of the top social media job opportunities from the past two weeks to get you started. Happy hunting!


from Mashable! http://mashable.com/2012/02/12/social-media-salary-infographic/?utm_source=fe...

Who Will Win a Grammy? Twitter Predicts the Future [INFOGRAPHIC]

Who’s going to win a Grammy on Sunday? Webtrends gives us a sneak preview, using Twitter as its crystal ball. Check out the prognostications in this exclusive infographic created fresh this morning using hot trending data from millions of tweets all over the world.

Webtrends, a digital marketing and analytics agency, uses the Twitter API to search for all kinds of data concerning the Grammys — for example, topical hashtags, artist names and Twitter handles, and album and song names. They started collecting this stuff a week ago, and the data has been flowing fast and furiously at a rate of 7,000 – 10,000 tweets per hour.

So who’s going to win the top Grammy honors? Who’s going to be the best dressed? Check out this infographic and you’ll be the first to know.

By the way, if this data isn’t fresh enough for you, you can also see what’s happening at this exact moment. To do so, take a look at the Webtrends Live Interactive Dashboard, which lets you put your finger on the pulse of Grammy-watchers the world over.

If you want to see more Twitter predictions of the Grammys, here are NM Incite’s stats from Friday, and don’t miss our own entertainment editor Christina Warren, who will be live-blogging the Grammys from the Staples Center in Los Angeles on Sunday night.

How accurate do you think these predictions are? Let us know in the comments.


Infographic courtesy Webtrends


from Mashable! http://mashable.com/2012/02/11/grammy-twitter-predictions/?utm_source=feedbur...

Famed Berklee Alums, Karmin, To Appear on Saturday Night Live Tonight [Videos]

After graduating from Berklee College of Music in 2008, Amy Heidemann and Nick Noonan took to YouTube, parodying Chris Brown’s hit “Look at Me Now” under the name Karmin. The video went viral, blowing up the blogosphere, and landed the duo on The Ellen DeGeneres Show. Tonight, the two have been given the chance to perform on another show: Saturday Night Live, where they’ll be performing alongside Zooey Deschanel at 11:30 p.m.

Described by their alma mater as a “fresh-sounding combination of pop music and hip hop,” Karmin has signed with Epic Records and L.A. Reid, and has opened for Gym Class Heroes and Lady Gaga. They’ve also performed with big-name acts like Pitbull, LMFAO, Foster the People and Avril Lavigne.

The duo chatted with Rolling Stone earlier this week, talking about their upcoming nuptials and beginnings at Berklee. Six years ago, Noonan was studying jazz trombone and, at one point, was on stage with Herbie Hancock. Later, when the school had Noonan perform in a Stevie Wonder tribute concert, they called Heidemann to sing vocals. The two met on that stage, and everything blossomed from there.

MTV has been keeping an eye on Heidemann as well, suggesting that Busta Rhymes has been schooled by a girl and writing, “She’s supplied a mile-a-minute tongue-twister [...] that will have your eyes crossing in 10 seconds flat.” She’s certainly got skill, and I may or may not have a girl crush on her.

You can check out some of Karmin’s most popular covers below, as well as some of their latest originals. Be sure to keep them on your radar, because who knows what’s next? From YouTube stardom to SNL, their career’s already off to a superstar start.

Photo courtesy of Billboard

from BostInno http://bostinno.com/2012/02/11/famed-berklee-alums-karmin-to-appear-on-saturd...

Bookle: Hands-on with the new Mac EPUB reader app

During Apple's January education event, one thing that many Apple bloggers were waiting for never appeared -- a version of iBooks for Mac. While that was a surprising omission, at least there's a new and well-implemented Mac book reader app that handles the EPUB format of most iBooks with ease and grace. Bookle (US$9.99) is a collaboration of Take Control Books publisher Adam Engst and Australian developer Peter Lewis of Stairways Software.

Bookle, which is available in the Mac App Store, reads non-DRM versions of EPUB books from the iBookstore. This is one of my few concerns about the app at this point, as many iBooks are copy-protected by digital rights management encryption. As Engst points out in the Introduction of the "Take Control of Bookle (1.0)" ebook that ships with the app, the main goal of this version of the app was to "get a program out quickly that can help you read our ebooks in the here and now." He admits that they may not be able to add support for reading DRM-encrypted ebooks, since "Neither Apple nor Amazon will license their DRM systems, and while Adobe will license Adobe Digital Editions, it's a six-figure cost...".

With that out of the way, let's take a look at the app. Bookle's icon is gloriously and beautifully designed (see image at top), which gives you an idea of the attention to detail given to the entire app. Bookle stores the EPUB files in the Application Support directory due to the Mac App Store sandboxing requirements, and books are easy to add to the Bookle library. You can use File > Open, drag the EPUB file onto the Bookle icon in the Dock or Finder, or just double-click the EPUB file.

Once the EPUBs are in the Library, they appear in a sidebar on the left side of the app's window. The sidebar of Bookle displays the list of ebooks and the table of contents of the ebook being read. At the top of the window are buttons to go back and forth in your reading history or up or down in chapters. There are also controls for changing the ebook's font and the font size, as well as setting the background color of the page.

Gallery: Bookle

As with many Lion apps, Bookle supports full-screen mode. I found this to be overkill on a 27" iMac, but it works very nicely on a smaller screen such as that on an 11" MacBook Air. If you close a window or quit the app, Bookle brings you right back to the last page you were reading when you open the book again. Bookle also has support for multi-touch gestures. Swiping two fingers left or right changes chapters when using a trackpad. There's also support for text-to-speech, so if you'd prefer to have an ebook read to you by your Mac, that's easy to do.

If you want to do side-by-side reading of two texts, all you need to do with Bookle is open each book in a separate window. I found this to be useful while making a comparison of two editions of one ebook, and I think it could also be very helpful if you're reading an ebook in one window and an explanatory text in the other window.

I mentioned earlier that I had a few concerns about Bookle -- one glaring omission is the inability to search a book for a specific word or phrase. I'd also like to see the ability to add bookmarks and make notations included in future versions of the app.

I'm sure that some TUAW readers will balk at Bookle's $10 price tag when Calibre is available for free. Frankly, I find Calibre to be a bloated (210.8 MB compared to Bookle's 4.1 MB) and poorly-implemented app that's horrible to use, and for reading ebooks it actually launches a separate ebook app called E-book Viewer. Bookle looks good, and is an excellent 1.0 implementation of a Mac ebook reader. I can't wait to see what the team of Lewis and Engst is able to add to Bookle in future versions.

Bookle: Hands-on with the new Mac EPUB reader app originally appeared on TUAW - The Unofficial Apple Weblog on Sat, 11 Feb 2012 18:00:00 EST.


from TUAW - The Unofficial Apple Weblog http://www.tuaw.com/2012/02/11/bookle-hands-on-with-the-new-mac-epub-reader-app/

BoardProspects Raises $650K to Transform Your Board & Plans Party for Boston Startups

Raising money is one of the hardest tasks when trying to start a company. It takes an incredible amount of both time and effort, and there are no short cuts. Second to raising money is creating a valuable board for your company. Boards can be incredibly helpful or be the demise of your company from the beginning. BoardProspects, which has just raised $650k in seed capital, is aiming to transform how boards are created. Think of it as a hyper-focused LinkedIn aimed at bridging the gap between companies and potential board prospects (hence the name). BoardProspects will also provide educational resources and tools enhancing communication between boards and prospects to improve boardroom transparency, diversity, and service.

Angel investors who participated in the seed round include Mike Verrochi (managing partner at Blue Rock Ventures), Brendan McCarthy (managing director of Goldman Sachs) and Paul Sullivan (partner of Sullivan Tire).

“BoardProspects will help solve the challenges of building, joining and running effective boards by providing an online community for boards and prospects to make connections, and by providing an unmatched level of expert content, best practices, and educational resources for both individuals and organizations,” said Mark Rogers, BoardProspects Founder and CEO.  “It is our intention to develop BoardProspects into the premier destination for boards and prospects to publicly and privately exchange their expertise, skills, and qualifications.  With this round of funding, we now have the resources to accelerate product development and enhance the site’s functionality to make this vision a reality.”

The market which they are going after is apparently huge and one that I’m sure none of you have even considered.

There are more than 60,000 publicly traded companies in the United States that are required by law to have a board of directors. In the Fortune 1,000 alone, there are more than 1,100 directors currently serving that are over 70 years old, according to GMI Research, the leading independent provider of global corporate governance, ESG and accounting risk ratings and research. The number of vacancies for the nearly 1.6 million non-profit organizations in the United States is exponentially larger, with more than 1.8 million of these seats turning over each year. These dramatic numbers do not include the hundreds of thousands of private companies that could benefit from building a board of directors or an advisory board, recruiting valuable members, or applying best practices to improve board performance.

In addition to the funding announcement, Braintree-based BoardProspects is giving BostInno readers a special treat. The first 250 people who sign up here will be part of their beta community and receive free admission to their exclusive launch party. The event will be at Space with a Soul on May 10th and will include a band and an open bar.

from BostInno http://bostinno.com/2012/02/09/boardprospects-raises-650k-to-transform-your-b...

Daily Mac App: Text2Speech lets you hear what you write in record time

OS X has a neat text-to-speech engine that'll read back what you write. You can start and stop TTS from the contextual menu or launch it using a keystroke that you set up in the Speech section of System Preferences. Most of the settings for TTS are buried in System Preferences, which is inconvenient when you want to change a setting on the fly. If you need more flexibility than OS X offers, you should take a look at Text2Speech. Text2Speech is a no-frills utility that uses OS X's underlying engine to read your text back to you.

The app gives you fine control over OS X's TTS engine in an easy-to-use UI. Once you paste your text into the app, you can change the voice that's speaking, change the speaking rate in small increments, and toggle the speech on and off with ease. It also tells you the character count of the passage, which is useful if you're writing a paragraph for a character-limited text box.

Text2Speech works well, with a couple of caveats. When you start the TTS, it always starts at the beginning, which is a minor annoyance. It would be useful if the app let you choose the starting position. It would also be helpful if it remembered your position when you stop it mid-passage. Despite these drawbacks, I still use Text2Speech every day. I find that the convenience of being able to change settings on the fly outweighs them.

If you want to try it yourself, Text2Speech is available for free in the Mac App Store. There's also a Pro version for US$3.99 that'll export your text to iTunes as an audio track or to your drive as an MP3 or AIFF file.

Daily Mac App: Text2Speech lets you hear what you write in record time originally appeared on TUAW - The Unofficial Apple Weblog on Fri, 10 Feb 2012 11:00:00 EST.


from TUAW - The Unofficial Apple Weblog http://www.tuaw.com/2012/02/10/daily-mac-app-text2speech-lets-hear-what-you-w...

Roku adds BBC iPlayer channel as it starts shipping in the UK

Just as Netflix is nearly ubiquitous on media streaming platforms in the US, BBC's iPlayer is pretty much a default app in the UK, so it's no surprise to see it show up on Roku's boxes just as they start shipping across the Atlantic. As detailed in the press release after the break, those shiny new Roku LT and Roku 2 XS hockey pucks are well on their way to punters who've shelled out £50 / £100, respectively, with over 40 available channels. Unfortunately, that announcement doesn't extend to global iPlayer support outside the UK and Republic of Ireland, so we'll have to catch up on Inside Men some other way.


Roku adds BBC iPlayer channel as it starts shipping in the UK originally appeared on Engadget on Fri, 10 Feb 2012 03:41:00 EDT.


from Engadget http://www.engadget.com/2012/02/10/roku-adds-bbc-iplayer-channel-as-it-starts...