Leeroy Jenkins Jr Available

Leeroy Jenkins Jr. is now mostly working.  It’s a 64-core AMD 3990x Threadripper with 128GB of memory.  It’s currently accessible via http://leeroy2.defthm.com/

There are still some things I’m playing around with, e.g., how many processes make sense to run at once and moving it from under my desk to the closet.  However, it should be generally available.

I’ve moved every build over to Leeroy Jr.  They still exist on Leeroy Sr, but, with the exception of the master build, they’re disabled.

The plan is, at some point, to ship the machine to Kestrel, where it will be hosted.  I’ll be hosting it in my closet until then.  When it’s shipped, I’ll reactivate the builds on Leeroy Sr.  I probably won’t play around with the leeroy vs leeroy2 subdomain until after it’s been installed at Kestrel, so keep the link handy.

I’ll post additional updates (e.g., about downtime) in the ACL2 Community slack channel #continuous-integration-leeroy-jenkins.  If you’d like an invite to the slack server, please let me know.

Jenkins vs Teamcity vs Others

A Search for an Alternative to Jenkins

Every couple years, I tend to evaluate the ACL2 community’s build server.  Is it meeting the community’s needs?  Do we need more cores?  Is it secure, or am I likely to receive an email from the cloud provider notifying me that they think the machine is mining cryptocurrency?  This time around, I’ve wondered if we should use a different Continuous Integration (CI) solution.

One hackernews discussion provided some perspective: https://news.ycombinator.com/item?id=19781907.  There are also lots of search results for “alternatives to Jenkins.”  Forrester had a graphic in a report in 2017 that Gitlab cites.  It’s 3 years old at this point, but it at least does a nice job introducing the contenders:

[Figure: graphic from the 2017 Forrester report, as cited by Gitlab]

Notes about Our Application and Community

Here are some relevant notes on our community.

  • We are an open-source community and would qualify for many CI providers’ “free” tiers.
  • Our ACL2 community development process is relatively light-weight and there is little tolerance for adding complexity to that process.  We have github issues and respond when newcomers create issues, but we hardly use them ourselves.  We would almost never be interested in CI pipelines, as they would add overhead to our currently light-weight contribution process.  Many of us use pull requests, and many of us push directly to one of the testing branches.
  • Re-using artifacts (built books/libraries) across builds saves a lot of build time.  We probably can’t reasonably use any solution that doesn’t easily allow us to re-use previously built artifacts.
  • Our build’s internal dependency graph is insane, and there’s little hope of making any CI software understand it.  My goal is not to clean up the build; it’s to get/keep something working in a small amount of time.

 

The Status Quo: Jenkins

Jenkins used to be the cat’s meow.  It opened up exploring continuous integration by being accessible and free, and we’ve been using it since 2014.

Since then, I’ve come to learn how painful Jenkins’ security story is.  Maybe we’re willing to trust that the Jenkins core is tightly maintained.  Unfortunately, I’m guessing it might be reasonably possible to compromise one of the lesser-known plugins.  And since lots of plugins that we “need” depend on lots of other plugins, we’re basically stuck absorbing plugins with goofy and obscure names (an indicator, to me, of a potential lack of scrutiny).  Right now we’ve got 30 plugins installed, of which only a handful were explicitly installed by me (the rest being dependencies).  That’s a lot of attack surface!

Even if we assume that there’s no intentional compromise, Jenkins and its plugins have lots of accidental security vulnerabilities.  It’s actually really good that Jenkins has a framework for indicating that a plugin, or Jenkins itself, has a security vulnerability and needs to be updated.  The problem is that updating Jenkins and its plugins automatically takes extra setup.  That being said, https://stackoverflow.com/questions/7709993/how-can-i-update-jenkins-plugins-from-the-terminal seems to have a good script for automating that process.
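
To give a rough idea of what automating plugin updates could look like, here is a minimal Python sketch.  It is not the script from the linked answer.  It assumes the Jenkins CLI jar is available locally, that its list-plugins command marks an available update by ending the line with the new version in parentheses, and that install-plugin and safe-restart are the commands to run afterwards.  The jar path and server URL are placeholders, and authentication is left out.

```python
import subprocess
from typing import List


def plugins_with_updates(list_plugins_output: str) -> List[str]:
    """Return the short names of plugins that appear to have an update.

    Assumption: a line ending in ")" carries "(new-version)" at the end,
    and the first whitespace-separated field is the plugin's short name.
    """
    names = []
    for line in list_plugins_output.splitlines():
        line = line.strip()
        if line.endswith(")"):
            names.append(line.split()[0])
    return names


def build_update_commands(cli_jar: str, server_url: str,
                          plugin_names: List[str]) -> List[List[str]]:
    """Build the CLI invocations: install the updates, then restart safely."""
    if not plugin_names:
        return []  # nothing to update, so no restart either
    base = ["java", "-jar", cli_jar, "-s", server_url]
    return [base + ["install-plugin", *plugin_names],
            base + ["safe-restart"]]


def update_plugins(cli_jar: str, server_url: str) -> None:
    """Query the server, then run the update commands (hypothetical wiring)."""
    listing = subprocess.run(
        ["java", "-jar", cli_jar, "-s", server_url, "list-plugins"],
        check=True, capture_output=True, text=True,
    ).stdout
    for cmd in build_update_commands(cli_jar, server_url,
                                     plugins_with_updates(listing)):
        subprocess.run(cmd, check=True)
```

Something like update_plugins("jenkins-cli.jar", "http://localhost:8080/") could then be run from cron.  Keeping the parsing separate from the subprocess calls makes it possible to check the parsing against saved CLI output before pointing it at a live server.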

This lack of security is a little more acceptable when Jenkins is only accessible behind a firewall. However, it’s probably less acceptable when the cost of an intrusion via source-code is higher (i.e., when money is involved).  This led me to look for alternatives.

Teamcity (Jetbrains)

I’ve had a deep appreciation for Jetbrains software and the accompanying sales model for a while now.  Their Java, GoLang, and C++ IDEs are really good.  And I appreciate that they’re willing to let the customer stop a product subscription after a year but still give the customer a perpetual license to the last version they paid for.  That’s really respectable.  Furthermore, I’m fed up enough with Jenkins that I’d love to pay someone $100 a year to manage the security story for me.

Here’s what I learned about Teamcity:

  • It was started in 2006.  That’s a long time ago.  Some parts seem well-modernized and others seem outdated.
  • It’s kind of clunky to configure — it’s hard to find the right blanks to fill out.  It always takes a while to learn a new interface, but this seemed particularly painful to me.
  • The way some features are implemented and explained ends up making things more confusing.  For example, you can easily monitor “testing-*” branches with a single build.  At first it seems strictly better (and indeed, for many readers, this is a feature not a liability), but supporting this feature adds an extra check-box/blank or two.  Once the blank that implements this feature is understood by the user, it’s not an issue.  However, it slowed me down a bit.  There are some other Teamcity features that were also clunky to learn.
  • Figuring out how to correctly merge changes from a testing branch into a main branch took a significant amount of time.
  • The documentation exists but could use improvement.
  • A lot of the stackoverflow posts on Teamcity topics tend to be from 2012-2013.  This is problematic for a couple of reasons: (1) the information provided is outdated and (2) it suggests Teamcity isn’t used as much anymore. (Disclaimer: It could also mean that Teamcity is a perfect product and no one has questions about it anymore, but that seems unlikely.)
  • When I posted a question to the Teamcity support forums, there was no reply.  This suggests that Teamcity is not a high priority for Jetbrains.  Furthermore, the situation this question described was likely indicative of a bug.  Searching for similar posts indicated that other people had encountered similar problems (many years ago).  The fact that I was still hitting that problem is pretty bad.
  • There isn’t a nice presentation for a “matrix” build.  I was able to share the configuration by using “templates”, but boy is it clunky to navigate each configuration.  This is relevant because we run the cross-product of {acl2_vanilla acl2_parallel acl2_real acl2_parallel_real} x {CCL SBCL GCL}.  This means when the user logs in, they’re overwhelmed with ~12 of our least important builds and have to search for the other ~4 “main” builds that are much more important.  Maybe there’s a work-around related to the project hierarchy for that.
  • It doesn’t have a light-weight mechanism for presenting build artifacts (e.g., *.cert.out files) to the user.  I’d need to set up an artifact server (like Artifactory).  I’m guessing open-source artifact servers exist, but I’d rather skip the complexity.  We’re a humble open-source project, not an enterprise.  It’d be kind of neat to have a history of build artifacts (perhaps for easily determining what runes used to be used to prove a theorem), but it’s likely pretty useless for our community’s development process.
  • Evidently there are plug-ins for Teamcity that may help with some of the above problems, but I’ve learned to dislike the concept.
  • Teamcity provides a nice mechanism for emailing notifications upon successful/unsuccessful builds.  It will also restart failed builds automatically (for a specified number of tries) and only email regarding the build failure once.  Unfortunately, there’s no easy way to include the log from a “build step” in that email.  This means I’d need to add my own mail command to the build script, which is work I don’t want to do.  More importantly, it would add code to the project that someone else would then have to maintain if/when I get too busy — i.e., if the smtp server changes, then someone has to go find the code that contains that server name, which is harder than searching through a [well-designed] gui.
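
To make the size of the “matrix” build mentioned above concrete, here is a small Python sketch that enumerates the cross-product of ACL2 variants and Lisp implementations.  The variant and Lisp names come from the list above; the "variant-lisp" naming scheme is only an illustration, not how our builds are actually named.

```python
from itertools import product
from typing import List

# Names taken from the matrix described above.
ACL2_VARIANTS = ["acl2_vanilla", "acl2_parallel", "acl2_real", "acl2_parallel_real"]
LISPS = ["CCL", "SBCL", "GCL"]


def matrix_build_names(variants: List[str], lisps: List[str]) -> List[str]:
    """Return one build name per (variant, lisp) pair.

    The "variant-lisp" format is a hypothetical naming scheme.
    """
    return [f"{variant}-{lisp}" for variant, lisp in product(variants, lisps)]


builds = matrix_build_names(ACL2_VARIANTS, LISPS)
print(len(builds))  # 4 variants x 3 Lisps = 12 configurations
```

A CI tool with first-class matrix support would present those 12 configurations as one grid rather than as 12 separate entries on the landing page.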

Impressions of Other Solutions

Here are my first impressions from other solutions.  They’re exactly that — first impressions.  I don’t want to spend an exhaustive amount of time to completely survey the space — I’m just trying to solve a problem in a reasonable amount of time.

Azure

Looking at the Azure cloud-ops solution, it looks rather simplistic, and I’m just guessing that it’s not MSFT’s main solution anymore.  They should probably be pouring their resources into Github, and they’re probably doing that.  Also, the Azure DevOps page seems to be pretty cloud oriented — e.g., they reference sharepoint on their page.  Further investigation would be needed to actually make a judgement.

TravisCI

The free tier for TravisCI has always appealed to me.  Here are the reasons we end up not using it.

  • There seems to be some sort of magical 120-minute limit to builds in TravisCI.  Since our full build takes well beyond that, we wouldn’t qualify for the free TravisCI tier.  If we hosted TravisCI ourselves, that limit could probably be changed.
  • It also seems that builds in TravisCI don’t re-use artifacts from previous builds.  Re-use of previous artifacts is a highly-desired feature in our situation, so this rules out TravisCI.
  • More investigation would be needed to really rule it out.

Gitlab

I really like gitlab.  Whereas github seemed to be resting on its first-to-market advantage, gitlab has been more innovative and also applied price pressure to github (my personal impression, but seemingly confirmed by Forrester).  That being said, it’s hard for me to imagine a world where it makes sense to use gitlab for the CI and github for actually hosting the repository, pull requests, etc.  Since the community strongly prefers to not move the repository, I’ve ruled out gitlab for now.

Github CI

At this point, I’ve run out of steam.  I’ve put this after Gitlab, but chronologically, I investigated it at the end.  It looks like CI is integrated via “Github Actions”.  Of the options that aren’t Jenkins, this is the one I would spend time investigating next.  However, I’m guessing it will have some of the drawbacks mentioned in other solutions, so I’m going to save myself the time for now.

Cloudbees

  • “Built on the most widely used automation server in the world Jenkins™ – CloudBees CI (Core) provides flexible, governed CI/CD you can trust”.  Okay, they’ve got my attention, as that’s pretty much exactly what I want.
  • Upon further reading, it seems like CloudBees CI is maybe more about managing multiple jenkins servers than making Jenkins itself more reliable.
  • I can’t get to a trial of their product without “scheduling a demo.”  Umm, help is typically appreciated, but the fact that I can’t just download and run it is a red flag.  Off the cuff, it seems likely to be too expensive.  Plus, I don’t want to deal with all of the communication overhead.

CircleCI

It looks pretty slick but isn’t a great fit for us.  Here are the two main things I learned.

  • They let you start with a VM, as opposed to a docker image.  This is convenient for me, since I haven’t dockerized all of the tools necessary to perform the complete build.  However, I have built a VM.
  • “CircleCI automatically runs your pipeline in a clean container or virtual machine, allowing you to test every commit” — this is actually a deal-breaker for us, as we want to re-use build artifacts across builds.

Conclusion

For now, I’ve decided to continue using Jenkins and automate the update process for Jenkins plugins and Jenkins itself.

I spent a lot of time trying to make Teamcity work, but I just can’t justify the turmoil given the lack of convenience.  The others are ruled out for reasons already described.  If/where I’ve misrepresented products, please let me know, and I’ll update the post.  Also, if you’ve read our requirements and know of another product I should consider, please also let me know.