Sorting Out Data Acquisition

Synthetic APIs approach improves fragmented data acquisition for Thomson Reuters’ content sharing platform

The next BriefingsDirect innovator interview examines the improved data use benefits at Thomson Reuters in London.

This is part of a discussion series on how innovative companies are dodging data complexity through the use of Synthetic APIs. Learn here how inventive companies, across many different industries and regions of the globe, get the best information delivered to those who can act on it, with speed and at massive scale.

Here to explain how improved information integration and delivery can be turned into business success, we're joined by Pedro Saraiva, product manager for Content Shared Platforms and Rapid Sourcing at Thomson Reuters. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: Kapow Software is a sponsor of BriefingsDirect podcasts.]

Here are some excerpts:

Gardner: You first launched the Thomson Reuters content-sharing platform over four years ago, after joining the company in 1996. And the platform there now enables agile delivery of automated content-acquisition solutions across a range of content areas. What are you delivering and to whom?

Saraiva: It's actually very simple. We're a business that requires a lot of information, a lot of data because our business is information -- intelligence information, and we need to do that in a cost-efficient manner. Part of that requires us to have the best technology. When we started four years ago, one of the most obvious patterns that we found was that we had a lot of fragmentation in our content acquisition processes: where they were based, who was doing them, and, more importantly, what processes they were following or not following.

The opportunity that we immediately saw was to consolidate it all, not just around the central capability, but into an optimal capability, with real experts around it making it work and effectively creating a platform as a service (PaaS) for our internal experts in each content area to perform their tasks just as usual, but faster, better, more reliably, and more consistently.

Fundamentally, we are a platform for web-content acquisition. And that is part of our content-shared platform because it's all part of a bigger picture, where we take content from so many sources and many different kinds of sources, and not just web.

Content management

I don't know the exact percentage, but I would guess that about half of what we do is content management, rather than site technology, per se. And a lot of those content management tasks are highly specialized because that's the only way we're going to add value. We're going to understand the content, where it comes from, what it means, and we are going to present it and structure it in the best possible way for our customers.

So, the needs of our internal groups and internal content teams are huge, very demanding, and very specialized. But they all have certain things in common. We found many of them were using Excel macros or some other technologies to perform their activities.

We tried to capture what was common, in spite of all that diversity, to leverage the best possible value from the technology that we have. We also drew on our know-how, expertise, and best practices around how to source content, how to be compliant with the required rules, and how to produce consistent, high-quality data that we could trust. That way, we could tell our customers that they could trust our content, because we know exactly what happened to it from beginning to end.

Gardner: Thomson Reuters is a large company. Tell us how large, and tell us some numbers around the number of different units within the company that you are providing this data to.

Saraiva: We have about 50,000 employees worldwide in the majority of countries. For example, our news operations have reporters on the ground throughout the world.

We have all languages represented, both internally and in terms of our customers, and the content that we provide to our customers. We're a truly diverse organization.

We have a huge number of individual groups organized around the types of customers that we serve. Are they global? Are they regional? Are they local? Are they large organizations? Are they small organizations? Are they hedge funds? Are they fund managers? Are they investment banks? Are they analysts? We have a variety of customers that we serve within each of our customer organizations around the world.

And that degree of specialty that I mentioned earlier, at some point, has to take shape. It takes shape in the vast number of different teams we have specializing in one kind of content. It may be, perhaps, just a language, French or Chinese. It may be fundamentals, versus real-time data. We have to have the expertise and the centers of excellence for each of those areas, so that we really understand the content.

Gardner: You had massive redundancy in how people would go about this task of getting information from the web. It probably was costly. When you decided that you wanted to create a platform and have a centralized approach to doing this, what were the decisions that you made around technology? What were some of the hurdles that you had to overcome?

Saraiva: We were looking for a platform that we would be able to support and manage in a cost-effective manner. We were looking for something that we could trust and rely on. We were looking for something that our users could make sense of and actually be productive with. So, that was relatively simple.

The biggest challenge, in my opinion, from the start, was that it's very hard to take a big organization with an inherently fragmented set of operating units and try to change it by introducing a single, central capability. It sounds great on paper, but when you start trying to persuade your users that there's value to them in migrating their current processes, they'll be concerned that the change is not in their interest.

Demonstrating value

And there is a degree of psychology at work in trying not only to work with that reluctance, which all businesses have to face, but also to influence it positively and to demonstrate that the value to our end users was far in excess of the threat that they perceived.

I can think of examples that are truly amazing, in my opinion. One is about the agility that we've gained through the introduction of technology such as this one, and not just the use of that technology, but the optimal use of it. Some time ago, before RSA was used in some departments, we had important customers who had an urgent, desperate need for a piece of information that we happened not to have, for whatever reason. It happens all the time.

We tried to politely explain that it might take us a while, because it would have to go through a development team that traditionally build C++ components. They were a small team and they were very busy. They had other priorities. Ultimately, that little request, for us, was a small part of everything we were trying to do. For that customer, it was the most important thing.

The conversation to explain why it was going to take so long, and why we were not giving them the importance that they deserved, was a difficult conversation to have. We wanted to be better than that. Today, you can build a robot quickly. You can do it and plug it into the architecture that we have so that the customer can very quickly see it appearing almost in real time in their product. That's an amazing change.

Gardner: What was the story behind your adoption of this?

Saraiva: We spent some time looking at the technologies available. We spoke with a number of other customers and other people we knew. We did our own research, including a little bit of the shotgun kind of research that you tend to do on the Internet, trying to find what's available. Very quickly, we had a short list of five technologies or so.

All of them promised to be great, but ultimately, they had to pass the acid test, which was evaluation in terms of our technical operations experts. Is this something that we are able to run? And also in terms of the capabilities we were expecting. They were quite demanding, because we had a variety of users that we needed to cater to.

But ultimately, most importantly, we needed the confidence that we could get our job done. If we are going to invest in a given technology, we want to know that it can be used to solve a given kind of problem without too much fuss, complexity, or delay, because if that doesn't happen, you have a problem. You have only partially achieved the promise, and you will forever be chasing alternatives to fill that gap.

Kapow absolutely gives us that kind of confidence. Our developers, who at first had a little bit of skepticism about the ability of a tool to be so amazing, tried it. After the first robot, typically, their reaction was "Wow." They love it, because they know they can do their job. And that's what we all want. We want to be able to do our jobs. Our customers want to use our products to do their jobs. We're all in the same kind of game. We just need to be very, very good at what we do. Kapow gave us that.

Critically important

With Kapow, it was a straightforward process. We just click, follow the process that really mirrors a complex workflow in the flow chart that we designed, and the job is done.

In terms of the rapid development of the solutions, it was at least a reduction from several months to weeks. And this is typical. You have cases where it's much faster. You have cases where it's slower, because there are complex, high-risk automation processes that we need to take some time to test. But the development process is shortened dramatically.

Gardner: We were recently at the Kapow User Summit. We've been hearing about newer versions, the Kapow platform 9.2. Is there anything in particular that you've heard here so far that has piqued your interest? Something you might be able to apply to some of these problems right away?

Saraiva: A lot of what we've been doing and focusing on over the last four years was around a pattern whereby we have data flowing into the company, being processed and transformed. We're adding our value, and it's flowing out to our customers. There is, however, another type of web sourcing and acquisition that we're now beginning to work with which is more interactive. It's more about the unpredictable, unplanned need for information on demand.

The main advantage of a cloud-based service running Kapow would be in freeing us from the hassle of having to manage our own infrastructure.

There, interestingly, we have the problem of integrating the button that produces that fetch for data into the end-user workflows. That was not possible, or at least not straightforward, with previous versions of Kapow. We would have to build our own interfaces, our own queues, and our own API to interface with the RoboServer.

Now, with Kapplets, it all looks very straightforward. We can easily see that we could have an arbitrary, optimized workflow solution or tool for some of our users that embeds a Kapplet, allowing a user to perform research on demand, perhaps on a customer, perhaps on a company, for the kind of data that we wouldn't traditionally be acquiring on a constant, fixed basis.
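
[Editor's illustration: To make the "own interfaces, own queues, own API" point concrete, the following is a minimal sketch of what such hand-built glue might look like. It is not Thomson Reuters' code and does not use any Kapow API. The names FetchRequest, OnDemandFetcher, run_robot, and fake_robot are all hypothetical, and run_robot is simply a stand-in for whatever call would actually execute a robot.]

```python
import queue
import threading
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class FetchRequest:
    """One on-demand research request, e.g. 'look up company X' (hypothetical)."""
    subject: str
    done: threading.Event = field(default_factory=threading.Event)
    result: Optional[Dict[str, str]] = None
    error: Optional[str] = None


class OnDemandFetcher:
    """Queues requests and passes them to a robot runner on one worker thread."""

    def __init__(self, run_robot: Callable[[str], Dict[str, str]]):
        # run_robot is an assumed stand-in for the call that executes a robot.
        self._run_robot = run_robot
        self._requests: "queue.Queue[Optional[FetchRequest]]" = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, subject: str) -> FetchRequest:
        """Called from the end-user tool, e.g. when a button is clicked."""
        request = FetchRequest(subject)
        self._requests.put(request)
        return request

    def close(self) -> None:
        """Stop the worker after it has handled everything already queued."""
        self._requests.put(None)  # sentinel telling the worker to exit
        self._worker.join()

    def _drain(self) -> None:
        while True:
            request = self._requests.get()
            if request is None:
                break
            try:
                request.result = self._run_robot(request.subject)
            except Exception as exc:  # report failures instead of losing them
                request.error = str(exc)
            finally:
                request.done.set()


def fake_robot(subject: str) -> Dict[str, str]:
    """Placeholder robot used only for this illustration."""
    return {"subject": subject, "status": "fetched"}


if __name__ == "__main__":
    fetcher = OnDemandFetcher(fake_robot)
    pending = fetcher.submit("Example Corp")
    pending.done.wait(timeout=5)
    print(pending.result)
    fetcher.close()
```

[Even this toy version needs a request type, a queue, a worker, and error handling, which is the kind of plumbing Saraiva says an embeddable component removes from the team's plate.]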

Gardner: Any advice that you might offer to others who are grappling with similar issues around multiple data sources, not being able to use APIs, needing a synthetic API approach?

Saraiva: I suppose the most important message I would want to share is about confidence in technology. When I started this, I had worked for years in technology, many of those years in web technology, some complex web technology. And yet, when I started thinking about web content acquisition, I didn't really think it could be done very well.

I thought this is going to be a challenge, which is partly the reason why I was interested in it. And I've been amazed at what is possible with technologies such as Kapow. So, my message would be don't worry that technology such as Kapow will not be able to do the job for you. Don't fear that you will be better off using your own bespoke C++ based solution. Go for it, because it really works. Go for it and make the most of it, because you will need it with so much data, especially on the Internet. You have to have that.

More Stories By Dana Gardner

At Interarbor Solutions, we create the analysis and in-depth podcasts on enterprise software and cloud trends that help fuel the social media revolution. As a veteran IT analyst, Dana Gardner moderates discussions and interviews that get to the meat of the hottest technology topics. We define and forecast the business productivity effects of enterprise infrastructure, SOA and cloud advances. Our social media vehicles become conversational platforms, powerfully distributed via the BriefingsDirect Network of online media partners like ZDNet and IT-Director.com. As founder and principal analyst at Interarbor Solutions, Dana Gardner created BriefingsDirect to give online readers and listeners in-depth and direct access to the brightest thought leaders on IT. Our twice-monthly BriefingsDirect Analyst Insights Edition podcasts examine the latest IT news with a panel of analysts and guests. Our sponsored discussions provide a unique, deep-dive focus on specific industry problems and the latest solutions. This podcast equivalent of an analyst briefing session -- made available as a podcast/transcript/blog to any interested viewer and search engine seeker -- breaks the mold on closed knowledge. These informational podcasts jump-start conversational evangelism, drive traffic to lead generation campaigns, and produce strong SEO returns. Interarbor Solutions provides fresh and creative thinking on IT, SOA, cloud and social media strategies based on the power of thoughtful content, made freely and easily available to proactive seekers of insights and information. As a result, marketers and branding professionals can communicate inexpensively with self-qualifying readers/listeners in discrete market segments. BriefingsDirect podcasts hosted by Dana Gardner: Full turnkey planning, moderating, producing, hosting, and distribution via blogs and IT media partners of essential IT knowledge and understanding.
