Because Hadoop isn’t perfect: 8 ways to replace HDFS

Hadoop is on its way to becoming the de facto platform for the next generation of data-based applications, but it’s not without flaws. Ironically, one of Hadoop’s biggest shortcomings now is also one of its biggest strengths going forward: the Hadoop Distributed File System.

Within the Apache Software Foundation, HDFS is always improving in terms of performance and availability. Honestly, it’s probably fine for the majority of Hadoop workloads that are running in pilot projects, skunkworks projects or generally non-demanding environments. And technologies such as HBase that are built atop HDFS speak to its versatility as a storage system even for non-MapReduce applications.

But if the growing number of options for replacing HDFS signifies anything, it’s that HDFS isn’t quite where it needs to be. Some Hadoop users have strict demands around performance, availability and enterprise-grade features, while others aren’t keen on its direct-attached storage (DAS) architecture. Concerns around availability might be especially valid for anyone (read “almost everyone”) who’s using an older version of Hadoop without the High Availability NameNode. Here are eight products and projects whose proprietors argue they can deliver what HDFS can’t:

Cassandra (DataStax)

Not a file system at all but an open source, NoSQL key-value store, Cassandra has become a viable alternative to HDFS for web applications that rely on fast data access. DataStax, a startup commercializing the Cassandra database, has fused Hadoop atop Cassandra to provide web applications fast access to data processed by Hadoop, and Hadoop fast access to data streaming into Cassandra from web users.

Ceph

Ceph is an open source, multi-pronged storage system that was recently commercialized by a startup called Inktank. Among its features is a high-performance parallel file system that some think makes it a candidate for replacing HDFS (and then some) in Hadoop environments. Indeed, some researchers started looking at this possibility as far back as 2010.

Dispersed Storage Network (Cleversafe)

Cleversafe got into the HDFS-replacement business on Monday, announcing a product that will fuse Hadoop MapReduce with the company’s Dispersed Storage Network system. By fully distributing metadata across the cluster (instead of relying on a single NameNode) and not relying on replication, Cleversafe says it’s much faster, more reliable and scalable than HDFS.
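
To put rough numbers on the replication-versus-dispersal trade-off, here is a minimal Python sketch. It assumes HDFS's default replication factor of 3 and a hypothetical dispersal layout that cuts data into 16 slices, any 10 of which can rebuild it. Those slice counts and the function names are illustrative assumptions, not Cleversafe's published configuration.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data under n-way replication."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return float(copies)


def dispersal_overhead(total_slices: int, threshold: int) -> float:
    """Raw bytes stored per byte of user data when the data is cut into
    `total_slices` slices and any `threshold` of them can rebuild it."""
    if not 0 < threshold <= total_slices:
        raise ValueError("threshold must be between 1 and total_slices")
    return total_slices / threshold


def losses_tolerated_replication(copies: int) -> int:
    """Copies that can be lost while one intact copy still survives."""
    return copies - 1


def losses_tolerated_dispersal(total_slices: int, threshold: int) -> int:
    """Slices that can be lost while `threshold` slices still survive."""
    return total_slices - threshold


if __name__ == "__main__":
    # Assumed values: 3 copies (HDFS default) vs. a hypothetical 10-of-16 layout.
    print("replication:", replication_overhead(3), "x raw storage,",
          losses_tolerated_replication(3), "losses tolerated")
    print("dispersal:  ", dispersal_overhead(16, 10), "x raw storage,",
          losses_tolerated_dispersal(16, 10), "losses tolerated")
```

Under these assumed parameters, dispersal stores less raw data while surviving more simultaneous failures. The sketch says nothing about the speed claims, which depend on the implementation.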

GPFS (IBM)

IBM has been selling its General Parallel File System to high-performance computing customers for years (including within some of the world’s fastest supercomputers), and in 2010 it tuned GPFS for Hadoop. IBM claims the GPFS-SNC (Shared Nothing Cluster) edition is much faster than HDFS, in part because it runs at the kernel level rather than atop the OS as HDFS does.

Isilon (EMC)

EMC has offered its own Hadoop distributions for more than a year, but in January 2012 it unveiled a new method for making HDFS enterprise-class — replace it with EMC Isilon’s OneFS file system. Technically, as EMC’s Chuck Hollis explained at the time, because Isilon can read NFS, CIFS and HDFS protocols, a single Isilon NAS system can serve to intake, process and analyze data.

Lustre

Lustre is an open source high-performance file system that some claim can make for an HDFS alternative where performance is a major concern. Truth be told, I haven’t heard of this combination running anywhere in the wild, but HPC storage provider Xyratex wrote a paper on the combination in 2011, claiming a Lustre-based cluster (even with InfiniBand) will be faster and cheaper than an HDFS-based cluster.

MapR File System

The MapR File System is probably the best-known HDFS alternative, as it’s the basis of MapR’s increasingly popular — and well-funded — Hadoop distribution. Not only does MapR claim its file system is two to five times faster than HDFS on average (although, really, up to 20 times faster), but it has features such as mirroring, snapshots and high availability that enterprise customers love.

NetApp Open Solution for Hadoop

OK, the NetApp Open Solution for Hadoop isn’t so much an HDFS replacement as it is an HDFS improvement, according to NetApp and early partner Cloudera. The offering still relies on HDFS, but it reenvisions the physical Hadoop architecture by putting HDFS on a RAID array. This, NetApp claims, means faster, more reliable and more secure Hadoop jobs.

This might be a good place to say rest in peace to two other HDFS alternatives that are effectively no longer with us — KosmosFS (aka CloudStore) and Appistry CloudIQ Storage. The former was created by Kosmix (since bought by @WalmartLabs) and released to the open source world in 2007, but no longer has an active community. The latter was an attempt by Appistry in 2010 to get a piece of the Hadoop pie with its computational storage technology, but the company has since switched its focus from selling the technology to providing high-performance computing services based on it.

Feature image courtesy of Shutterstock user Panos Karapanagiotis.



from GigaOM http://gigaom.com/cloud/because-hadoop-isnt-perfect-8-ways-to-replace-hdfs/?u...

So You Want to Be a Programming Intern, Part 1: Bullets


[Pictured, from left to right: Josh, Grace, Ingrid, Julian, Shane]

Let me let you in on a rather shameful secret: up until my senior year of college, that “Proficient in Microsoft Office” bullet on my resume meant to convey my computer literacy? Not true. I used two pieces of software: TextEdit and WordPress.

You know who could back up that bullet? Our interns. But, given that they know Python, Objective-C, BASIC, LISP, LaTeX, JavaScript, CoffeeScript, MATLAB, UnityScript, Ruby, Java, C, C++, C#, Bash and and and… there is no way it would make the resume-worthy cut.

In part one of our getting-to-know-us series, interns Josh, Grace, Ingrid, Julian, and Shane give you some bullets of a different sort.

The questions:

1. Where do you go to school?
2. What do you study?
3. Give us a unique/random fact of your choosing.

The answers:

Josh Grinberg:

  1. Stanford University
  2. I’m still deciding on a major, but I know for sure I’m really excited about Mechanical Engineering and Computer Science, and also possibly Bioengineering.
  3. I like performing in circus shows. My favorite trick is juggling torches, while balancing on a rolling globe and reciting 100 digits of pi.

Grace Yue Gong:

  1. Worcester Polytechnic Institute (WPI)
  2. Computer Science
  3. One of my hobbies is that I maintain a tropical fish aquarium with live plants.

Ingrid Hagen-Keith:

  1. Olin College of Engineering
  2. Since I just completed my freshman year, I haven’t really decided what I want to major in, though I’m pretty sure that it will involve Computer Science and perhaps some Design.
  3. I spilled glowing goo on my laptop senior year during a chemistry class. It was really embarrassing AND it was filmed so I got to relive it over and over…

Julian Ceipek:

  1. Olin College of Engineering
  2. I am studying how to learn, with an emphasis on Computer Science and Design.
  3. My favorite designer is Bret Victor. If you haven’t, you should watch Inventing on Principle, a fabulous talk about having a powerful ideal motivating your work.

Shane Skikne:

  1. Olin College of Engineering
  2. I haven’t declared but I am considering either Electrical and Computer Engineering or Engineering with Computing or Robotics… or maybe Systems, but definitely not BioEngineering (probably).
  3. For my entire life, I have written the capital Q incorrectly. Instead of putting the dash in the bottom left side of the circle, I put it in the top right. No one ever told me I was doing it wrong until college. Don’t ask me how, but I believe that it all started because Quailman (yes, from Doug) wore that belt on his head.

from BostInno http://bostinno.com/channels/so-you-want-to-be-a-programming-intern-part-1-bu...

Twitter is building a media business using other people’s content

As Twitter continues to build out new features such as “expanded tweets” and curation-based services like its NASCAR editorial offering, it has become pretty obvious where the company is headed: it has given up on being a utility built on open APIs and is becoming a media company, powered by a rapidly-growing advertising platform. Twitter also has one big advantage that other media companies don’t: the fact that it doesn’t have to produce any of the content, but simply acts as a filter for information from other sources. Its success will be determined by how well it strikes a balance between helping other media entities and competing with them.

Twitter CEO Dick Costolo has repeatedly resisted suggestions that Twitter is a media entity, perhaps in part because the company wants to be seen as a partner for traditional media companies like newspapers and TV networks. But as its advertising business grows larger — thanks in part to reports from advertisers of “staggering” levels of engagement with ad features like promoted tweets — and it continues to tighten the rules on its API to squeeze out third-party developers, it becomes more and more clear that Twitter’s future is based on controlling access to the information flowing through the network as closely as possible.

Twitter’s future lies in capturing more user attention

The “expanded tweets” feature, which is currently being used by a number of media companies such as the New York Times (and GigaOM), is a glimpse of what this future looks like: if a tweet contains a link to an article or webpage that uses special tags, users can expand the tweet to show an excerpt of the original — or a video or photo or other content — inside the Twitter app or in a tab on Twitter.com. In an interview this week with the Los Angeles Times about the company’s plans, Costolo said something interesting about how Twitter sees its role. According to the newspaper, he said:

Twitter is heading in a direction where its 140-character messages are not so much the main attraction but rather the caption to other forms of content.

So instead of being a simple information utility that distributes 140-character messages, many of which contain links to other kinds of content, Twitter wants to become something more like a destination. Instead of sending people who click those links away to other websites and media outlets, the company wants to hang onto users for a little longer by showing them excerpts of that content inside its own frame — and it wants to do this primarily so that it can capture more of their attention, since that’s what advertising-based media players do. But then who ultimately gets to retain the value of that attention, Twitter or its media partners?

As I’ve argued before, this isn’t all that different from what the New York Times and other traditional media outlets are trying to do: namely, to walk the tightrope between pushing people away by giving them links to content elsewhere, and trying to hold onto them long enough to show them ads. Even Google is struggling with this dilemma — the company used to be known for the speed with which it sent users elsewhere, but over the past few years it has been spending more and more time trying to capture the attention of users and hold onto them for longer with services like Google+ and its Search Plus Your World feature. Facebook is also trying to be a partner for media companies, but to some extent is a competitor for attention and ad dollars as well.

Send users away, or try to keep them inside your app?

In a recent interview with Om as part of paidContent 2012 in New York, Betaworks CEO John Borthwick made a good point about the tension between pushing people away and holding them in. As he put it:

The grain of Twitter moves with the grain of the web; it’s similar to Google in that it’s a discovery platform that pushes you out. Now they’ve said “We’re going to be a media company” — but the grain of that moves in the opposite direction, to try and keep people in one place, to almost create a walled garden.

The big challenge for Twitter, then, is to somehow manage that transition properly. How does it capture — and to some extent control — more and more content from outside sources, so that it can hang onto users for longer and show them ads, without irritating the media companies and other entities that it relies on for that content? This is why critics such as blogging pioneer Dave Winer argue that Twitter is more of a competitor for media companies than a partner, because it is trying to do fundamentally the same thing that media outlets are trying to do, and it is doing so by using content that belongs to others.

This isn’t all that different from what services like Flipboard and Zite do: they also take content from other sources and aggregate and filter it, and they also walk a fine line by showing excerpts of an original source, or in some cases showing the whole thing inside their own frame (something Zite was threatened with a lawsuit over early in its career, before it was acquired by CNN). Some media companies see these services as partners, as the New York Times does with Flipboard and the Wall Street Journal does with Pulse, but they could also be seen as competitors in many ways.

Is Twitter a friend and helper to media companies, or a growing rival for both attention and ad dollars? Is it more focused on sending users away or on keeping them inside its walled garden? Those are the questions that anyone interested in Twitter’s future — either as a service or as a business — needs to think about.

Post and thumbnail images courtesy of Flickr users George Kelly and jphilip



from GigaOM http://gigaom.com/2012/07/11/twitter-is-building-a-media-business-using-other...

Intuit releases Mint QuickView in the Mac App Store

Personal finance site Mint.com (owned by Intuit) has released its first OS X app in the Mac App Store. Called Mint QuickView, the app allows Mint.com users to quickly take a peek at their finances. The app isn't a full-fledged personal finance app like Quicken or iBank; rather, it's a small window (read: mini browserish) to your Mint account. You access it from OS X's menu bar to see the balances on all your Mint accounts, recent transactions, and net income. The app also displays badged notifications when you have new transactions or one of your accounts needs tending to.

Owners of the new MacBook Pro will also be pleased to find the app is Retina display ready. Mint QuickView is a free download and requires Mac OS X 10.6 or later.

Intuit releases Mint QuickView in the Mac App Store originally appeared on TUAW - The Unofficial Apple Weblog on Wed, 11 Jul 2012 09:00:00 EST. Please see our terms for use of feeds.

from TUAW - The Unofficial Apple Weblog http://www.tuaw.com/2012/07/11/intuit-releases-mint-quickview-in-the-mac-app-...

‘White Collar’ Cranks Up the Heat With Interactive Social TV Game




When White Collar returns to the airwaves later tonight, fans will be invited to go along for the ride in a new interactive social TV game.

The game and experience, dubbed Neal's Stash, is the final stage of a multi-faceted social TV and transmedia campaign that Ford sponsored for the USA Network show.

White Collar is about the partnership between slick con man Neal Caffrey and FBI agent Peter Burke, as the two work together to solve crimes. Although Neal has seemingly put his life of cons behind him (for now), he has various stashes from past cons still out in the wild.

The campaign kicked off back in January with Mozzie's Mission, a game…

from Mashable! http://mashable.com/2012/07/10/white-collar-neals-stash/?utm_source=feedburne...

Dropbox Doubles Storage Space for Paid Users, Gives Their Friends 100GB Free Trials [Dropbox]

Good news for Dropbox Pro users: Starting today, you'll have twice the amount of storage space, for the same cost. Instead of 50GB, you'll have 100GB to play with; instead of 100GB, it'll be 200GB of space. You can also send others a 100GB 3-month trial for the online sharing and syncing service.


from Lifehacker http://lifehacker.com/5924812/dropbox-doubles-storage-space-for-paid-users-gi...

VIM 101: a quick-and-dirty guide to our favorite free file editor

In the world of text editors, there's a plethora of options out there. If you've ever Googled "how to edit HTML sites" or some such, you know what we mean. Allow us, then, to introduce you to VIM, a free website editor that offers many of the same features as Adobe Dreamweaver, and runs on just about every desktop platform. Specifically, it comes by default on the vast majority of Linux distributions, OS X and commercial Unix systems. (It's available to install on Windows, too.) And did we mention it's free? That command line UI isn't necessarily self-explanatory, though, so join us after the break for a quick crash course to help you get started.

VIM 101: a quick-and-dirty guide to our favorite free file editor originally appeared on Engadget on Tue, 10 Jul 2012 15:00:00 EDT.

Source: VIM

from Engadget http://www.engadget.com/2012/07/10/vim-how-to/

Google+ for iPad hits the App Store, invites you to Hangout with your Apple slate

As promised way back in late June at I/O, Google+ now has its very own fully iPad-supported app. Available now via iTunes, the app offers up some tablet-centric features, like the ability to drag posts from your stream for sharing, streaming Hangouts to a TV via AirPlay and expanding posts with a pinch to add comments. The updated Google+ iPhone app, meanwhile, lets users create and manage Google+ Events. The app can be downloaded now from the source link below.

Google+ for iPad hits the App Store, invites you to Hangout with your Apple slate originally appeared on Engadget on Tue, 10 Jul 2012 13:47:00 EDT.

Source: Google Blog, iTunes

from Engadget http://www.engadget.com/2012/07/10/google-for-ipad-hits-the-app-store-invites...

SiliconDust HDHomerun Prime CableCARD tuners hit Woot for $130

If you've been thinking about building an HTPC without spending a lot of money, then first of all, we have a post that can help you with that (and a comment section of folks saying they can do even better), and second, it might be time to grab one of SiliconDust's HDHomeRun Prime TV tuners. The three-tuner CableCARD device can turn your computer into a cable box, and Woot is selling brand new units for just $130 (plus $5 shipping) in this morning's one-day sale, a decent discount from the $180 - $200 prices we found elsewhere. Still not convinced this is for you? Check out our hands-on with the device or a quick video trailer embedded after the break. Oh, and if you need a new HDTV to plug it into, Woot's also running a sale on some LG LCDs with 3D and connected apps for $650 / $900 (47-inch / 55-inch).

SiliconDust HDHomerun Prime CableCARD tuners hit Woot for $130 originally appeared on Engadget on Tue, 10 Jul 2012 02:59:00 EDT.

Source: Sellout.Woot

from Engadget http://www.engadget.com/2012/07/10/silicondust-hdhomerun-prime-cablecard-tune...

20 TV Shows With the Most Social Media Buzz This Week




Animation flooded this week's social TV chart, with mainstays like SpongeBob SquarePants, Family Guy and The Simpsons topping the bill.

Anderson Cooper 360 gained attention on the web, following the news that the show's host Anderson Cooper had revealed his homosexuality in a statement to The Daily Beast. "The fact is, I’m gay, always have been, always will be, and I couldn’t be any more happy, comfortable with myself and proud," Cooper wrote in an email to columnist Andrew Sullivan.

Following the news break, Twitter and Facebook were abuzz; some users broadcast their support, while others debated the newsworthiness of Cooper's revelation.


from Mashable! http://mashable.com/2012/07/09/social-media-tv-chart-7-9/?utm_source=feedburn...