re:Invent 2024 Predictions Revisited

Alright, re:Invent 2024 is a wrap. Now let’s see how I did on my predictions!

VectorDB (7/10)

I predicted a managed Pinecone or similar vectorDB, but what we got instead was a whole host of RAG Knowledge Base options. Kendra GenAI Index is a new index type for Amazon Kendra (their enterprise search system). We also got knowledge base support for graphs and for structured data sources like Redshift and Lakehouse. Not a perfect hit, but I think I was on the right track here. By the way, I also learned that Neptune now supports both graph and vector queries over the same data set, which is super powerful.

Carbon Explorer (2/10)

OK, this was a big fail for me for two reasons. First, there were no announcements about it. In fact, all the major cloud players seem to be avoiding talking about carbon impact as much as possible because GenAI has destroyed their green initiatives. Second, I just realized that a version of this tool was already announced in March 2022. It’s still a pretty basic capability at this point, but it does exist for those who care.

FOCUS imports (0/10)

I predicted (more like dreamed about) being able to import custom FOCUS data into AWS Cost and Usage Reports and Cost Explorer. No luck, but I can dream.

Zero ETL to Iceberg (7/10)

Well, this one is hard to say. As of today, you still can’t zero-ETL into Iceberg format on S3, though I was told by multiple sources that this capability would be coming in Q1. What we did get were two closely related announcements that show the direction this is moving. First was S3 Tables, a new bucket type for “fully-managed” Iceberg tables in S3. Unfortunately, I can’t find any evidence that S3 Tables are supported as a zero-ETL target.

The second announcement of note is SageMaker Lakehouse. Lakehouse provides an Iceberg API on top of a data lake, supporting raw S3 and Redshift Managed Storage thus far. Currently, SageMaker Lakehouse supports zero-ETL from a host of different sources, but RDS is not one of them. The team was focused on third-party sources for re:Invent, but has already added DynamoDB and will get to RDS soon.

If it sounds like there is a lot of overlap between SageMaker Lakehouse and S3 Tables, that is because there is. I’m told they are working to merge the two over the next year or so, but the idea is for S3 Tables to be an opinionated easy-button while Lakehouse is a highly customizable powerhouse. For example, S3 Tables automates Iceberg compaction for you, while Lakehouse requires you to schedule compaction and tune various parameters regarding how and when it runs. I really like the approach of having both a simple default tool and the ability to grow beyond it, so I’m hopeful they will maintain both options.

The biggest news here is that AWS is making Iceberg their table format of choice for zero-ETL. All zero-ETL services will produce Iceberg data, which should make adding zero-ETL support for S3 Tables, Redshift, Lakehouse, and any other Iceberg-compatible target very easy.

Managed Delta Live Tables (0/10)

Every indication is that AWS is all-in on Iceberg. Databricks has strong influence on both Iceberg and Delta Live Tables, and has indicated that they want to evolve both into a single standard over time. It appears AWS is going to ride out that evolution rather than support both formats in the near term.

Fargate GPU (0/10)

This one makes me sad because it seems so simple. I guess there are just not enough customers wanting to run GPU-heavy workloads on Fargate.

Database Savings Plans (0/10)

All indications point to AWS pushing serverless databases as the answer to cost management rather than traditional RDS. With many customers wanting the benefits of serverless while still having rather static workloads, I do think some kind of savings plan is likely in the future, but it will probably only cover Aurora Serverless flavors (including the new DSQL).

Enhancements to AWS Organizations (2/10)

Well, it seems that this was considered a distraction from the AI-heavy re:Invent announcements. Still, the progress that was made on Organizations over the weeks leading into re:Invent cannot be ignored. AWS is taking this seriously; they just decided to announce all of it prior to the big event. Sadly, AWS Support remains a relic of the old single-account model, and I see little evidence of that changing anytime soon.

Amazon Q for Operations (9.5/10)

I predicted (wished for) enhancements to Q that would turn it into an operational ninja, and I pretty much hit this one right on the head. Unexpectedly, the new capability is part of Q Developer, not Q Business, which in retrospect makes a lot of sense. This is EXCELLENT, as Q Developer continues to dramatically outperform Q Business in general usefulness. What I did not see coming was AWS embracing partners so heavily with regard to GenAI agents. Not only is Q Developer able to be your operational troubleshooter extraordinaire, but several other vendors have launched co-branded incident agents. PagerDuty, Datadog, and Wiz are leading the pack, but I’m sure many others will be fast followers. I haven’t used any of these yet, and I’m sure they will be far from perfect, but over the next few years they will really transform the landscape for incident response.

On top of the huge help they’ll bring to troubleshooting, I imagine we’re very near the point where an agent can listen in, watch the troubleshooting effort, and provide an interface to stakeholders. Never again will a team have to stop trying to fix the problem in order to update the product manager, CIO, CEO, sales team, and whatever other random executive wants to yell “Why isn’t this fixed yet?” at everyone while wasting the time of the very people trying to fix the problem! It almost makes me miss being on call... HA, no, not really.

What was your favorite announcement from re:Invent?
