It has already been over a week since I got back from Seattle, where I attended the Open Source Summit NA in the beautiful convention centre. On the same trip I also presented at the Data Engineer Things group’s inaugural meeting in Bellevue. Both events were very successful for Starburst as sponsor of my trip, for Trino as the open source SQL query engine I talk about, and for myself as a learning participant in many talks and great conversations.
Data Engineer Things
The initial impetus for my trip was an invite from the Data Engineer Things group, which was looking for a speaker for its first meeting in Seattle. At the request of the organizers, I decided to put together a Big Data Whirlwind Tour, with a perspective from Trino as an open source SQL query engine. The meeting at the Databricks office in Bellevue started with numerous chats with other data engineers before the presentations. Over 100 people enjoyed food, drinks, presentations, and learning together. After my presentation we raffled off some hard copies of my book Trino: The Definitive Guide and continued networking and exchanging ideas. Judging from the reactions at the event and afterwards, everyone had a great time. Check out the recap from Veronika, from Yaakov, and from Pallavi.
Open Source Summit NA
In addition to presenting at the DET group, I attended the Open Source Summit NA conference from the Linux Foundation, which really is an aggregation of multiple smaller conferences. Following are some of my thoughts.
Security and Open Source
Various security and open source initiatives under the OpenSSF umbrella and beyond are quite interesting. Luckily, Trino is already doing a good job managing dependencies, but it is always good to see
The Redis license change triggered the effort around Valkey as another industry-supported fork and community version. These decisions seem to be becoming a tradition, and the party closing its license seems to lose each time. After the event, IBM bought HashiCorp, which means that IBM is now involved in both Terraform and OpenTofu, as well as Vault and OpenBao. I am sure this is going to be part of some interesting conversations.
The XZ Utils backdoor hack showed once again that maintainers need better ways to get funded and supported to avoid burning out. In the long run, a trust relationship and network between projects needs to evolve as well. Open source projects in general still need a lot more support to be viable in the long term, and to get the support they need as critical infrastructure for all systems. We need to move from mere viability of a project to good vitality: projects need to thrive and not merely survive. Personally, I strongly believe that independent foundations need to actually hire maintainers, instead of just providing marketing, legal, and other framework support for their projects.
And now for some smaller notes:
- Software Bill of Materials (SBOM) efforts are becoming more and more important. I will look at that for Trino.
- SPDX 3.0 is finally released.
- Keynote chat with Linus Torvalds was entertaining and educational at the same time.
- The Azure Innovation Engine demo for executable docs was impressive. It may be worth checking out in more detail.
- The open source dependency project from Google looks interesting: https://deps.dev/
- PyTorch seems very widely used, and better support for it in the Trino ecosystem would be a good step to get closer to AI workloads.
AI everywhere
Not surprisingly, AI and all the related buzzwords around it were a recurring theme at the conference. The combination of open source and AI causes some very interesting and difficult issues in terms of licensing. A lot of this is currently undefined, and various claims of open source models are technically incorrect. I fully expect this to be tumultuous for quite some time. The sometimes conflicting and sometimes aligned interests of different groups in terms of copyright, trademark, licensing, and other legal issues, together with the desire to be remunerated, will cause lots of upheaval. Different projects and stakeholders are already creating systems on the full spectrum from fully transparent and open source to completely closed. We live in exciting times ... again.
Smaller specific notes around the idea that everyone wants a piece of the pie:
- LF launches AI & Data Generative AI Commons?!
- Intel launched the Open Platform for Enterprise AI with the LF.
- The effort from Databricks to get DBRX off the ground was massive, and it was great to learn about the details and their plans.
- The demo of running a Gemma model from Google locally on a laptop was interesting, and showed how simple this could become over time.
- The Open Language Model (OLMo) from Allen AI looks like an interesting and very open initiative with significant participation.
Catching up
Beyond the great keynotes and many sessions I attended, I would also like to thank the many Trino community members and other open source hackers and enthusiasts I met in the hallways, after sessions, and in the exhibition hall.
See you at the next event…
PS: Also see the related post on LinkedIn.