InfaWorld2006

Friday, May 26, 2006

Day After Stats

I just thought that I'd share with everyone the day-after stats on hits to these reflections. Here's the map.


It was fun to see how they spread out across the country when people got home from the conference. I hope that people are getting some value from my reporting and reflections on the conference.

Thursday, May 25, 2006

Closing Keynotes - Don Tapscott

Don Tapscott, author of this year's inspirational book, The Naked Corporation, gave an incredibly rousing speech about the changing nature of business in today's world. One of the key takeaways I got from his keynote was that people growing up today (in the range of 4 to 28 in his estimation - the "echo boomers") are a huge generation of people who are not just technically literate; technology for them is just like air. These people, he finds, have great BS detectors -- you can't pull the wool over their eyes. As an employer or product or service provider to them, the only option is to be honest with them. They're hard to fool and they won't trust you if they find you to be dishonest. Hence, The Naked Corporation.

Being just barely outside his defined age range, I was ready to jump up and give him a standing ovation for his insight into my values. Transparency. Openness. Willingness to share insight and services. Collaboration. These are all things that I love and value in the companies that I see emerging on the net. I thought my only peers were other Open Source fanatics, but in his estimation, it's the entire generation right behind me.

His take on the situation is that the most successful companies in the current environment are going to be those that are willing to bare their all to the world. Those who are willing to admit their mistakes before someone else catches them. Those who are willing to share their services in such a way as to lower barriers to collaboration between their partners and drive the overall value across companies higher than any company could drive their value alone. Mutual openness benefiting everyone involved. Lowering the cost of transactions, which are the barrier to entry for collaboration and openness.

As you might expect, this ties in closely with the challenges of data integration and software/business process outsourcing. Moving data and processes outside of a corporation increases transaction costs, but also increases the potential value of that expertise to the company. The challenge is to lower that transaction cost so that the real value of BPO/SaaS arrangements can be achieved.

Yeah! Way to go Don! I can't wait to read my newly signed copy.

Wednesday, May 24, 2006

Night on the Town

The Evening Adventure in San Francisco was great. I missed the first bus to depart from the hotel for Fisherman's Wharf and Ghirardelli Square, but really had what was probably a better view hanging on the outside of an old trolley. I think I took a whole roll of film on the ride... good thing it's a digital camera.



Dinner at McCormick & Kuleto's was nice. A bit crowded, but there was plenty of food and, as you might expect, plenty of decadent chocolate cheesecake. Informatica made a nice choice.


Pier 39 was a good visit. Turns out that sea lions, if you didn't already know, are pretty loud and kind of stinky. But they're fun to watch when they push each other off of the little rafts.


One more morning left of the conference. I'm not attending any of the post conference education (PC8, PC Metadata Management, or DQ Strategy and Planning) although I have no doubt that they would be at least as valuable as the pre-conference TDWI session I attended. (I prefer to get back to my wife and daughter a day earlier.)

Integration as a Service for BPO and SaaS

I've said it already, but I think the most progressive part of the conference so far has been the discussion of how enterprises that use business process outsourcing (BPO) and software as a service (SaaS) vendors will be able to integrate that data into their decision making processes and business event streams. Many people probably don't realize just how prevalent BPO is, even in places that think they don't outsource.

When you start to think about it in a certain way, BPO and SaaS are really just another type of specialization. We don't all farm our own food - we source that activity to farms. We don't generate our own electricity - we ask the electric company to do that. Car manufacturers don't manufacture rubber, or tires - they get different companies to do that for them.

We're not talking about the outsourcing of application development to offshore companies, here. We are talking about looking to companies that are already experts in an industry to provide other businesses with that expertise or a product that results from that specialized expertise. The question of integrating more sophisticated information from that process into our own enterprises, though, is the challenge that Sohaib and others have been talking about. I think it's very exciting.

Wednesday Breakout Sessions

Highlights from the Wednesday breakout sessions so far.

Pushdown Optimization
One of the newest (and most promising in my opinion) features of PowerCenter 8 is pushdown optimization - the feature that allows you to choose parts of a mapping that could be "pushed down" onto the source or target database. For instance, you might have a simple mapping that just pulls data out of one table, does some lookups, an aggregation, and loads that back into a database table. In our case, we do things like this a lot to build data marts. Well, we've got a new data warehouse platform that has much more powerful hardware and we're consolidating those data marts onto the single database instance. Sure, I could lose all my metadata and replace those mappings with an SQL statement... but wouldn't it be better to just have PowerCenter optimize the workload by pushing it onto the database? Of course, I can do this selectively. If I want to do something off database I can, but this technology option really opens up PowerCenter's traditional ETL model into the ELT space that some database vendors might suggest you take a closer look at for certain types of operations.
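To make the ETL versus ELT distinction concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. It is not PowerCenter and does not show the SQL PowerCenter would generate; the tables (orders, regions, sales_mart) and both function names are hypothetical, made up for illustration. The first function does the lookup and aggregation outside the database, the way a traditional mapping would; the second expresses the same logic as one statement that the database executes itself, which is the general idea behind pushing work down.

```python
import sqlite3

# Hypothetical tables for illustration only; none of these names come from PowerCenter.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, region_code TEXT, amount REAL);
    CREATE TABLE regions (region_code TEXT, region_name TEXT);
    CREATE TABLE sales_mart (region_name TEXT, total_amount REAL);
    INSERT INTO orders VALUES (1, 'MW', 100.0), (2, 'MW', 50.0), (3, 'W', 25.0);
    INSERT INTO regions VALUES ('MW', 'Midwest'), ('W', 'West');
""")


def load_mart_etl(conn):
    """ETL style: pull rows out, look up and aggregate outside the database, load back."""
    # Cache the lookup table in memory, keyed by region code.
    lookup = dict(conn.execute("SELECT region_code, region_name FROM regions"))
    totals = {}
    for region_code, amount in conn.execute("SELECT region_code, amount FROM orders"):
        name = lookup[region_code]
        totals[name] = totals.get(name, 0.0) + amount
    conn.executemany("INSERT INTO sales_mart VALUES (?, ?)", sorted(totals.items()))


def load_mart_pushed_down(conn):
    """ELT style: the same lookup and aggregation as a single statement run by the database."""
    conn.execute("""
        INSERT INTO sales_mart (region_name, total_amount)
        SELECT r.region_name, SUM(o.amount)
        FROM orders AS o
        JOIN regions AS r ON r.region_code = o.region_code
        GROUP BY r.region_name
    """)
```

Both functions leave the same rows in sales_mart. The difference is where the work happens: in the first, every source row crosses the wire to the tool and back; in the second, nothing leaves the database.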

Speaking of Teradata, they presented a session about Active Data Warehousing and Active Enterprise Intelligence. The trend that Teradata is discussing in the industry is the ability for a data warehouse (which presents an intelligence opportunity) to become actively enabled, so that businesses can use the intelligence they derive from that system in a more real-time, tactical, operational, or automated way. The idea behind the talk was an opportunity curve: as time increases through the chain of business event, data acquisition, information discovery, and business response, the potential value of that response decreases, eventually to the point that the business has completely missed the opportunity. Reducing that path from business event to response is one key to making active decisions. The second key is to make intelligent decisions... business intelligence is embedded in a mature data warehouse. Therefore active enterprise intelligence comes from the process of activating an enterprise warehouse. Sounds simple enough... except when you see the pulsating, flashing slides in the Teradata presentation!
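The opportunity curve can be sketched as a small function. This is only an illustration: the talk described the value of a response falling as latency grows until the opportunity is gone, but the straight-line shape, the function names, and the parameters below are my own assumptions, not anything Teradata presented.

```python
def total_latency(event_to_data, data_to_information, information_to_response):
    """Add up the delays along the chain: acquisition, discovery, then response (in hours)."""
    return event_to_data + data_to_information + information_to_response


def response_value(latency_hours, initial_value, window_hours):
    """Value left in a response that arrives latency_hours after the business event.

    Assumes, purely for illustration, a straight-line decline that reaches
    zero at window_hours, after which the opportunity is completely missed.
    """
    if latency_hours >= window_hours:
        return 0.0
    return initial_value * (1.0 - latency_hours / window_hours)
```

With these assumptions, shaving hours off any link in the chain moves the response back up the curve, which is the argument for shortening the path from event to response.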

As Sohaib likes to say: this is hard work and you all deserve raises!

Morning Keynotes

Another good round of keynote speeches this morning.

Girish Pancha, EVP of Products, gave us an overview of the Informatica roadmap for the future, covering two new projects that go into the 2008+ timeframe: DaVinci and Galileo. I'll cover more on these later when I can post up some pictures, but DaVinci is focused around enabling business analysts and the design/specifications phase of the data integration project lifecycle. Imagine a consistent mechanism that can be used to define data integration needs at a business level, incorporating on-the-fly data profiling/discovery/cleansing; and being able to use those to generate actual data integration processes, in a traceable way, that can be deployed onto the PowerCenter platform. Galileo is all about architecture, specifically the trend toward data services in a service oriented architecture. Galileo gives architects the ability to specify service levels, deployment environments (e.g. SQL or web services), latency requirements, and security needs, and drive those down into how the services being specified and built are deployed into the environment. Maybe a bit grandiose and high level for some practitioners, but exciting stuff!


In addition, we had Gary Beach, Group Publisher of CIO magazine, give an insightful (if somewhat dry, perhaps) interview with three CIOs live on stage.

Hit Report

I thought I'd share some statistics about the blog's popularity. The past 24 hours have attracted over 160 distinct visitors from around the globe and 350 page views. The diversity of locations is a testament to Informatica's global presence:


Please share the site with your Informatica coworkers and contacts!

Tuesday, May 23, 2006

Partner Fair

All the usual suspects are at the partner fair. I figure it isn't polite to name only a few names, so I'll just link to the list.


Although you can't tell, this is a demo of the new Java transformation available in PowerCenter 8. I think that was one of the better additions to PC. It provides easier access to an environment that allows you to easily implement more classic procedural logic: looping, recursion, etc. I've never really jumped on the Java bandwagon - I grew up as a real programmer with real programming languages like 68k assembler, Scheme, and C. Embedding Java in databases seemed like a silly idea to me, but I think I'm finally coming around to the whole idea of embedding Java in other tools. (I still call it the COBOL of the 90's, though.)

Quick Personal

Sunday was my birthday. When I got back from the partner fair tonight there was a package in my hotel room - chocolate covered strawberries!


Thanks!

My wife's the one who should be getting great gifts, though.
She's single-parenting for the week while I'm in SF!

Session Highlights (3 of 3) - Miscellaneous

I didn't attend a specific session during the third breakout window today, so I'll just give you a laundry list of other sessions that I picked up handouts from and found interesting -- not to imply that any particular session wouldn't have been interesting, I just can't be 7 places at once and don't want to highlight everything. You can see a list of all the sessions on Informatica's website.

Best Practices: Data Administration and Quality
Dan Linstedt had a record-breaking (based on my complete lack of data) audience for his session on error tracking and reporting, data quality management, and accountability best practices.

Service-Oriented Data Integration Architecture
SOA, ESB, EAI, ETL, EII... I'm so glad we've got a new acronym that only has two letters: DI. Still, I think the SOA approach to architecture definition is a very powerful idea right now. The challenge, I'm sure, for most organizations is taking those ideas and finding the right way and right time to start implementing them.

Achieving High Availability in a Server Grid
Another fun idea that I'm really curious about right now. One of the things that you hear about organizations doing with their data warehouse is making it a mission-critical application - part of the core business, fed in real-time. Well, if the technology that you're using to feed a high-availability, real-time platform isn't also high-availability... that doesn't really work, does it? The whole integration chain has to be high-availability to achieve real-time high-availability out the front end.
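A quick back-of-the-envelope sketch of why the whole chain matters. It assumes the stages sit in series (the front end needs every one of them) and fail independently, in which case the end-to-end availability is the product of the stage availabilities. The function names and any numbers you feed it are hypothetical, not figures from the session.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def chain_availability(stage_availabilities):
    """End-to-end availability of stages in series, assuming independent failures."""
    return prod(stage_availabilities)


def expected_downtime_hours(availability):
    """Expected hours of downtime per year at a given availability (0.0 to 1.0)."""
    return (1.0 - availability) * HOURS_PER_YEAR
```

For example, chain_availability([0.999, 0.99, 0.999]) comes out a little below 0.99: two "three nines" stages wrapped around a "two nines" integration layer deliver less than the weakest link alone.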

Session Highlights (2 of 3) - Informatica as an Enterprise Platform

Sara Stonner's presentation on Implementing a PowerCenter-based Enterprise Data Integration Platform was different than what I expected, and so incredibly valuable (not that I thought it wouldn't be valuable; it was just valuable in a different way).

She told the story of Morgan Stanley going from a position where various departmental IT organizations were all evaluating ETL tools to the creation of a centralized, leverageable, managed platform on which all of those IT organizations could develop and deploy their data integration solutions. I was inspired by the story she told. There are probably many organizations in which PowerCenter is a centralized solution in one particular IT area - usually Data Warehousing. So many other IT organizations can take advantage of that same DI platform, though, where they might otherwise end up getting data unloads and writing sqlldr scripts, or building a simple Java app to copy data from one database to another. With the right internal PR, support, training, and opportunity, those same organizations can just take advantage of the appropriate tool (PowerCenter) that's already deployed and hosted. Her message about setting up an environment where the barriers to entry were EXTREMELY low (while still securely managing the risk of having so many groups using a single platform) is one of the key takeaways. In her organization the results were astonishing -- an adoption rate that exceeded their 2-year outlook within just a few months.

Powerful ideas!

Session Highlights (1 of 3) - Real-Time Informatica

Obviously I can't attend all the breakout sessions myself, but I'll give you a highlight of the sessions that I attended and a little bit about some of the ones I wish I could have attended also.

Session 1: Real-Time Informatica - Case Studies
As you might expect, the whole notion of real-time (or right-time) data integration is one of the challenges that DI professionals struggle with. Most DI folks come from a batch background, be it via an ETL platform or mainframe unloads or bulk SQL execution. From any ETL vendor's perspective, DI is an extension of Data Warehousing, and Data Warehousing definitely started with batch jobs: monthly summaries done after the books were closed to provide fast access to summary-level results. So, here we are, ETL professionals extending into more generalized Data Integration along with a business drive for faster answers to those questions. One of the things I commented on during the TDWI session yesterday was that business users are now looking to have answers "before the numbers are ready." For certain kinds of applications you might estimate a billing amount just so you can get the answer before the order goes through an actual bi-weekly billing cycle.

Real-time Informatica is all about access to other application interfaces in a trickle-feed / stream / real-time / event-based / message-oriented way. MQ Series, Tibco, JMS, WebServices. One of the things I'm really interested in is looking at the PowerCenter platform as a place to host back-end web services. That seems like a powerful way to achieve reusability not just within the PowerCenter platform, but out to any other application that you're building. I find it encouraging to know that there are customers out there taking advantage of not only connectors to real-time sources like MQ, but also who are serving up DI logic as a callable web service that will execute business logic embedded in a transformation on a row-by-row, call-by-call basis.

Not cutting edge anymore.
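Here is a rough sketch of the reuse idea: one piece of transformation logic that serves both a batch run and a row-by-row, call-by-call service. It is plain Python, not PowerCenter's web services interface; the billing rule, field names, and function names are all hypothetical, and the request handler just takes and returns JSON strings rather than standing up a real server.

```python
import json


def estimate_bill(row):
    """Hypothetical business rule: estimated amount = quantity * unit price, to the cent."""
    return {**row, "estimated_amount": round(row["quantity"] * row["unit_price"], 2)}


def run_batch(rows):
    """Batch path: apply the rule to a whole set of rows at once."""
    return [estimate_bill(row) for row in rows]


def handle_request(json_body):
    """Per-call path: one row in, one row out, as a web service endpoint might do."""
    return json.dumps(estimate_bill(json.loads(json_body)))
```

The point is that estimate_bill is written once. Whether rows arrive in a nightly file or one at a time from another application, the same logic answers.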

Keynote Highlights

As always, Informatica's created a great atmosphere for the keynotes.



President and CEO Sohaib Abbasi gave his usual speech about the complexity of the data integration landscape and how Informatica, as the premier DI company, is about so much more than data warehousing. This year's theme, though, seems to be focused around integrating data from BPO and SaaS operations. Informatica announced a PowerCenter Connect for salesforce.com that it plans to follow with SaaS connects for other providers as well. Sohaib also shared that Informatica's own software roadmap leads toward implementing PowerCenter as an on demand service as well, integrated into applications like salesforce.com. (To be honest, what surprised me is what a big operation salesforce.com has become. They support SFA for Cisco!) Sohaib also shared a great demo of how Similarity's Data Cleansing integrates into PowerCenter.



This transitioned perfectly, of course, into the next keynote speaker, Marc Benioff, Chairman and CEO of salesforce.com. Marc focused more on Web 2.0 and how the "business web" is becoming more and more prevalent with applications like salesforce.com. Well-defined business operations can be packaged up and outsourced as a software service over the web. He did live demos of salesforce.com as well as other applications like Adobe's PDF creator and Writely and various Google Maps mashups. Obviously on the cutting edge of the business web, Marc shared some powerfully exciting insights about the future of business operations and how applications might integrate in the future.


Finally, Accenture was back again with Royce Bell and Shari Rogaliski to present some powerful insights about information management. (I always enjoy Royce's presentation style and British sense of humor.) After gracing us for five minutes with a slide of a statue of a naked man... Royce spent most of the speech alluding to Malcolm Gladwell's "Blink", and how important it is to bring information to users in context rather than simply flood them with information. It gives them the ability to build the intuition used in decision making from a continuous exposure to real facts.

Monday, May 22, 2006

TDWI PreConference Education

The Data Warehousing Institute offered great prices on two different pre-conference education opportunities:
  • Data Integration Techniques
  • Data Warehousing Architectures

I attended the Data Integration Techniques workshop, presented by Dave Wells, director of education for TDWI. He did an absolutely outstanding job on the workshop and brought together a lot of different ideas about what it takes to do data integration projects of any kind: data warehousing, data consolidation, data synchronization, data integration from mergers, etc. Though the general ideas aren't really new to veteran data integration/warehousing people, the organizational tools that Dave presented were extremely valuable:
  • Definition of a continuous life cycle for data integration activities, reflecting the fact that none of us ever can really build something and walk away -- the BI/DW space is too connected to business rules and business units that are ever changing. (In fact, one of the things that BI is supposed to do is drive change in the business!)
  • A taxonomy of terms and ideas to use when looking at the activities within the data integration lifecycle.
  • A methodology for (1) identifying and qualifying source systems for use in an integration project; (2) mapping sources to targets (or targets back to sources as he prefers) at the various levels of abstraction from business entity to physical table/file to field level; (3) classifying types of transformations that are potentially needed in the integration activity; (4) digging up and validating business rules using continuous data profiling; (5) evaluating and measuring data quality as a formal metric over the data integration activities; and (6) working with business to put it all together into a successful and iterative lifecycle. (Wow! What a mouthful.)
I highly recommend taking a look at the TDWI courseware and taking the time to attend some of these workshops. Even if you already know everything, there's always a different way of looking at things that can help you tweak your own system for approaching data integration challenges -- and most of us don't know anything close to everything!

Thanks Dave!

You Need Data Integration

This year's theme is all about data integration. There have been ads around the theme throughout the year in various trade publications, and now everyone who attends can be a walking advertisement, too:


San Francisco Views

Now that I've got my digital camera, I thought I'd share a couple quick views from the hotel. This is my first trip to San Francisco, so, of course, I'm fascinated by all the sights.

My limited view of the Golden Gate Bridge:


My view of the harbor:


My view of Barry Bonds NOT hitting 715 against the Cardinals -- though the Giants did really beat up the Cardinals tonight:

Leadership Council Meeting

The first item on the agenda for those of us who are User Group leaders was the annual Leadership Council Meeting. This was my first Leadership Council Meeting, and I have to say that it far exceeded my expectations -- not that they were low expectations by any means. This all-day event was an opportunity for chapter leaders to meet each other face to face; share stories, tips, and techniques about organizing our chapter events; and learn some new facilitation techniques that we can take back and apply in our own meetings.

We also got to have some time with Sohaib Abbasi, the Chairman of the Board, CEO, and President of Informatica. I have to say that I was awed by his candor with our group of user group leaders. He walked into the room and took the mic for what I expected to be a 10 minute feel-good speech. Instead he spoke for less than 2 minutes and asked us "what do you want to know?" He answered every question the group threw at him to a reasonable satisfaction: from "what do you think about these ideas to make education/certification programs more targeted/relevant/accessible" to "what's your vision of where data integration is going" to "how is Informatica addressing changes in their competitive landscape." I think it's a great indication of how much respect he has for the Informatica user community -- realizing, as he said in his own words, that "people won't believe the things that he says about Informatica, they have to hear it from real users."

Also in attendance at the council meeting, and equally willing to share their thoughts and answer questions, were various members of the Informatica leadership team representing customer advocacy, program management, and product management.

Sunday night was a dinner and reception for the user group leaders. (I have to admit that I didn't attend. It was my birthday, and I went out to Cliff House with a buddy from work, upon recommendation from my wife.) I'm sure that a good time was had by all!

Introductions

Welcome to my Informatica World 2006 blog! I decided to keep a journal of my comings and goings at this year's Informatica World user conference to help connect those people who can't attend the conference with the value that comes out of it. If you're a general Informatica developer or someone interested in purchasing Informatica software or already a user group member who couldn't attend this year, I hope you'll find the content that I post to be valuable. Maybe even good ammunition to use in convincing your management to send you next year. If you're not here (or are here, for that matter) and want to hear about something specific, post a comment and I'll try to put up a post about it.

As for me... I'm just another Informatica user and advocate. I happen to be the co-chair of the St. Louis Informatica User Group, which means that I got the opportunity to come out to San Francisco early this year for the user group leadership summit (look for the next post). I work as a data warehousing and integration engineer at Express Scripts, Inc -- no, not as in "scripting language." We're a major pharmaceutical benefit management and mail order pharmacy company. Meaning that, like so many companies, we have lots of great data warehousing and data integration opportunities.

So, throughout the week, look for daily posts about the various sessions and workshops and activities that I get the opportunity to attend!