By CMS News Desk
February 25, 2010 06:00 AM EST
IBM on Thursday announced it is working with the British Library on a project that will preserve and analyse terabytes of information on the Web before it is lost forever.
The new analytics software project, called IBM BigSheets, helps extract, annotate and visually analyse vast amounts of Web information using a Web browser. IBM's new technology prototype is helping the British Library archive and preserve massive numbers of Web pages, and then unlock the virtual door to its archives for generations to come.
IBM's new analytics technology is helping the British Library speed up the archival process before Web data is lost forever. The Web changes rapidly: new pages are created every day, producing an explosion of data that disappears almost as quickly as it is published. Recent research estimates the average life expectancy of a Web site at just 44 to 75 days, and every six months 10 percent of Web pages on the UK domain are lost.
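As a rough back-of-the-envelope illustration of that loss rate, the sketch below assumes that the 10 percent loss compounds at a constant rate every six months. That compounding is an assumption made here for illustration, not something the cited research states, and the function name `surviving_fraction` is hypothetical.

```python
def surviving_fraction(years: float, loss_per_half_year: float = 0.10) -> float:
    """Fraction of pages still online after `years`.

    Assumes a constant loss that compounds every six months
    (an illustrative assumption, not a claim from the article).
    """
    periods = years * 2  # two six-month periods per year
    return (1.0 - loss_per_half_year) ** periods


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(f"after {years} years: {surviving_fraction(years):.1%} remain")
```

Under that assumption, about 81 percent of pages would still be online after one year and roughly a third after five.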
"IBM BigSheets does for big data what spreadsheets did for personal computing," said Rod Smith, vice president, Emerging Internet Technologies, IBM. "Within a matter of minutes, researchers, academics and students will be able to search many terabytes of archived Web pages from the UK domain, analyse the results and effortlessly visualise the results of the search."
Preserving Data for Generations to Come
Each year more than six million searches are generated by the British Library online catalogue, and nearly 400,000 people visit the British Library reading rooms, looking for information. The British Library receives a copy of every physical publication produced in the UK and Ireland, amounting to more than 150 million maps, manuscripts, musical scores, newspapers and magazines that it must archive. Beyond just the physical assets, the British Library has been archiving selected Web pages from the UK domain since 2004. With BigSheets, users of the Library will be able to access vast archives of historic Web sites, and easily research and analyse their queries and visualise the results of the search.
"We estimate the UK Web space will contain over 11 million Web sites by 2011. To take on the enormous challenge of capturing this content, we need a system capable of taking the UK Web Archive to Web-scale," said Helen Hockx-Yu, Web Archiving Programme Manager, The British Library. "IBM can help us analyse the web archive containing millions of pages and unlock embedded knowledge which otherwise is difficult to discover using traditional search methods."
Whether it's someone interested in their own genealogy or a student working on a project for school, people need help making sense of this growing sea of information on the Web. For example, the 2005 election marked the first attempts by UK politicians to use the Web as a campaigning tool. With the use of Web campaigns expected to explode during the 2010 election, the 2005 collection will enable researchers studying the evolution of politics and the Web to access hugely valuable primary source material.
BigSheets: The Technical Foundation
This year, the amount of digital information is expected to reach 988 exabytes, which is equivalent to a stack of books from the Sun to Pluto and back. The Web is exploding with data, and business professionals want to access that data, both structured and unstructured, to gain better insight into their business. IBM BigSheets is an insight engine that helps businesses draw insights from very large data sets easily and quickly. By building on top of the Apache Hadoop framework, IBM BigSheets is able to process large amounts of data quickly and efficiently.
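Hadoop is built around the MapReduce programming model, in which work is split into a map step, a grouping step and a reduce step. The sketch below imitates that model in plain, single-process Python to count words across documents. It does not use any Hadoop API and is not BigSheets code; the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(doc: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in doc.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(docs: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over all documents."""
    pairs = (pair for doc in docs for pair in map_phase(doc))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["web archive", "Web data"]))
```

In a real cluster the map and reduce steps would run in parallel on many machines; here they run in one process purely to show the shape of the computation.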
IBM BigSheets is a new technology prototype. Users can explore and generate new data insights using a Web application, and the IBM software then publishes Web 2.0 standard data feeds that can be searched by British Library patrons.
BigSheets is an extension of the mashup paradigm that integrates gigabytes, terabytes, or petabytes of unstructured data from Web-based repositories; collects a wide range of unstructured Web data stemming from user-defined seed URLs; extracts and enriches that data using an unstructured information management architecture; and lets the user explore and visualise this data in specific, user-defined contexts. For example, users can see search results in a pie chart and look at the data in a tag cloud.
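The extract, enrich and visualise steps described above can be illustrated with a minimal Python sketch. It is not BigSheets code and does not use the unstructured information management architecture: pages are supplied in memory rather than crawled from seed URLs, and all names (`TextExtractor`, `enrich`, `tag_cloud_weights`, `host_shares`) and the `example` hosts are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser
from typing import Dict, List, Tuple
from urllib.parse import urlparse


class TextExtractor(HTMLParser):
    """Collects the text between tags (script and style content is not
    filtered out in this sketch)."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    """Extract step: reduce an HTML page to its plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def enrich(url: str, html: str) -> dict:
    """Enrich step: attach the source host and a lower-cased word list."""
    return {
        "url": url,
        "host": urlparse(url).netloc,
        "words": extract_text(html).lower().split(),
    }


def tag_cloud_weights(records: List[dict], top: int = 10) -> List[Tuple[str, int]]:
    """Word frequencies that could size the terms in a tag cloud."""
    counts = Counter(word for rec in records for word in rec["words"])
    return counts.most_common(top)


def host_shares(records: List[dict]) -> Dict[str, float]:
    """Share of pages per host, the kind of data a pie chart would show."""
    hosts = Counter(rec["host"] for rec in records)
    total = sum(hosts.values())
    return {host: n / total for host, n in hosts.items()}


if __name__ == "__main__":
    pages = {
        "http://a.example/1": "<p>Vote now</p>",
        "http://a.example/2": "<h1>Vote</h1>",
        "http://b.example/": "<p>news</p>",
    }
    records = [enrich(url, html) for url, html in pages.items()]
    print(tag_cloud_weights(records, top=3))
    print(host_shares(records))
```

The two aggregate functions return plain data rather than drawing anything, so any charting tool could render the result as a tag cloud or pie chart.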