@MassiGy
Last active March 9, 2025 21:18

Understanding Go modules.

Before Go 1.11, Go developers had only one choice about where to write Go code: they had to work under $GOPATH. This path defaults to $HOME/go, and in there we can find these directories:

    $ ls $GOPATH
    bin/        pkg/        src/

The `src/` folder contains your source code and the source code downloaded by the go toolchain.
The `pkg/` folder contains compiled package objects and, since modules were introduced, the cache of downloaded modules (under `pkg/mod/`).
The `bin/` folder contains the binaries installed by the go toolchain.

So every developer back then had to write their Go code under $GOPATH/src.

Since Go 1.11, we have the choice to either continue using $GOPATH or use the new way, which is Go modules.


Go modules come with a new set of possibilities. Go developers no longer have to stay under $GOPATH/src to write their source code; they can write it anywhere they want. The only thing they have to do is set up a go.mod file at the top level of their project, so that it becomes a proper Go module.
The go.mod file contains at least the module path and the Go version the module was set up with.
Go 1.11 also introduced a new environment variable, `GO111MODULE`, that can be set to `on`, `off`, or `auto`. "on" means that every Go project on the current machine is treated as a Go module, "off" means that we still want to develop under $GOPATH/src, and "auto" lets the go command decide based on whether a go.mod file is present. When the variable is unset, Go 1.11 through 1.15 default to "auto", while Go 1.16 and later default to "on".
You can check the current value with this command: `$ go env GO111MODULE`

Moreover, thanks to go.mod files, other developers can now use our module as a dependency for their project by importing it after we've published it to GitHub or GitLab.


Once a package is imported using the go toolchain, Go updates the go.mod file to register this new dependency for the project, as well as all its transitive dependencies. Go also records checksums to make sure that, for a given dependency name and version, the content has not changed either. This is what allows the go toolchain to guarantee reproducible builds.


NOTE: the checksums are stored in an adjacent file named go.sum, which keeps go.mod readable.


The process goes like this:


Developer A:

  • Create a go module with a proper go.mod file.

  • Create a remote repo on github/gitlab.

  • Make sure that the module name in the go.mod file matches the remote repo path to which the module will be uploaded.

  • Add a tag to the go module to give it an initial version.

  • Upload this module to the remote repo ( do not forget to push the tag as well ).
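
The "module name matches the repo path" step can be checked mechanically. Below is a minimal sketch, assuming an HTTPS clone URL; the repository `github.com/devA/greeter` and the helper names are hypothetical and only serve as an illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// modulePathFor derives the module path that go.mod should declare for a
// repository reachable at repoURL (HTTPS form only, to keep the sketch small).
func modulePathFor(repoURL string) (string, error) {
	u, err := url.Parse(repoURL)
	if err != nil {
		return "", err
	}
	if u.Host == "" {
		return "", fmt.Errorf("no host in %q", repoURL)
	}
	// Drop a trailing slash and the optional ".git" suffix used by clone URLs.
	p := strings.TrimSuffix(strings.TrimSuffix(u.Path, "/"), ".git")
	return u.Host + p, nil
}

// matches reports whether the "module" line of a go.mod file agrees with
// the repository URL.
func matches(goModLine, repoURL string) bool {
	want, err := modulePathFor(repoURL)
	if err != nil {
		return false
	}
	fields := strings.Fields(goModLine)
	return len(fields) == 2 && fields[0] == "module" && fields[1] == want
}

func main() {
	repo := "https://github.com/devA/greeter.git" // hypothetical repository

	fmt.Println(matches("module github.com/devA/greeter", repo)) // true
	fmt.Println(matches("module greeter", repo))                 // false
}
```

If the two do not agree, go get will typically refuse the module because the path it was asked for differs from the path the go.mod file declares.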


Developer B:

  • Find the project that developer A uploaded.

  • View its go.mod file.

  • Grab the module name from it. (This should be the same as the repo path.)

  • Go back to our terminal (developer B).

  • At the top level of the project, run $ go get <developer A's module path>


This will add developer A's module as a dependency to developer B's project.


This is the ideal workflow, but it does not work seamlessly out of the box every single time. For this concrete process, the hiccups might occur when developer B runs the $ go get command. What if:

  • Developer A did not push the go.mod to the remote repo ( same with go.sum if any )

  • Developer A's project and developer B's project depend on the same third-party dependency but require different versions.

  • Developer A's project was archived/removed/taken down/tampered with.

  • Developer A works in a company and his/her project is private.

Let's tackle these issues one by one.


If developer A does not commit the go.mod and go.sum files to the remote repo: then developer A either does not want to make his/her project available to others as a dependency, or he/she is not following the Go guidelines. In other words, if you do not mind others using your Go project as a dependency, then do commit your go.mod and go.sum files to your remote repo.


If developer A's project and developer B's project depend on the same third-party dependency but require different versions: in this scenario, the Go toolchain has you covered. Module support grew out of vgo, a prototype written by Russ Cox (github @rsc) and merged into the go command in Go 1.11. Using its minimal version selection algorithm, the go command picks, for each dependency, the highest version that any module in the build requires. It resolves these dependency issues for you by itself, without compromising your reproducible builds or backward compatibility.


Russ Cox gave a great talk on this subject at GopherCon Singapore in 2018, titled "Opening Keynote: Go with Versions".


So for this particular issue, the Go toolchain has you covered (most of the time).
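
To make minimal version selection more concrete, here is a small sketch of the idea, not the toolchain's actual implementation: walk every requirement reachable from the main module and keep, for each module path, the newest version that any visited go.mod asks for. The module paths (`example.com/a`, `example.com/b`, `example.com/c`) are hypothetical, and the version comparison only understands plain vMAJOR.MINOR.PATCH strings.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// modVer identifies one version of one module, e.g. example.com/c at v1.4.0.
type modVer struct {
	Path, Version string
}

// parse splits "v1.4.0" into its numeric parts. It handles only plain
// vMAJOR.MINOR.PATCH versions (no pre-release or build suffixes).
func parse(v string) [3]int {
	var out [3]int
	parts := strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3)
	for i, p := range parts {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// newer reports whether version a is strictly newer than version b.
func newer(a, b string) bool {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] > pb[i]
		}
	}
	return false
}

// buildList visits every module version reachable from root and keeps, for
// each module path, the newest version that any visited module requires.
func buildList(root modVer, reqs map[modVer][]modVer) map[string]string {
	selected := map[string]string{}
	seen := map[modVer]bool{root: true}
	queue := []modVer{root}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, dep := range reqs[cur] {
			if have, ok := selected[dep.Path]; !ok || newer(dep.Version, have) {
				selected[dep.Path] = dep.Version
			}
			if !seen[dep] {
				seen[dep] = true
				queue = append(queue, dep)
			}
		}
	}
	return selected
}

func main() {
	b := modVer{"example.com/b", "v0.0.0"} // developer B's project (main module)
	a := modVer{"example.com/a", "v1.0.0"} // developer A's module
	reqs := map[modVer][]modVer{
		b: {a, {"example.com/c", "v1.2.0"}}, // B wants C at v1.2.0 or later
		a: {{"example.com/c", "v1.4.0"}},    // A wants C at v1.4.0 or later
	}

	list := buildList(b, reqs)
	paths := make([]string, 0, len(list))
	for p := range list {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Println(p, list[p])
	}
	// Prints:
	// example.com/a v1.0.0
	// example.com/c v1.4.0
}
```

Note that the result is v1.4.0 even if a v1.5.0 of `example.com/c` exists: nothing in the build asked for it, so the selection stays the same until someone edits a go.mod file. That is what keeps builds reproducible.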



If developer A's project was archived/removed/tampered with: for this scenario, we'll dive deeper into how go get works, and especially how GOPROXY and GOSUMDB come into play. To understand how these work, let's see what go get would do if GOPROXY and GOSUMDB were disabled.


Developer: I want to download a dependency for my project hosted on GitHub, using neither a GOPROXY nor a GOSUMDB. Steps:

  • The developer will run the appropriate go get command.

  • Assuming that the dependency's repo is still there, go get will download the entire source code as a zip file.

  • Go get will then decompress it to parse the go.mod file.

  • Go get will download the zip files of all the transitive dependencies that are not available on the developer machine with the correct set of versions.

  • Go get will compute the dependencies' checksums and add them to the go.sum file.

  • Go get will then skip the checksum verification, since we do not have any checksum db to check against. (So it is basically a first trust issue for every developer who does the same.)
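
The checksum step above can be sketched as follows. This is an illustration modeled on the `h1:` lines found in go.sum (a SHA-256 over a sorted list of per-file SHA-256 hashes, base64-encoded), not the toolchain's own code; the file names and contents are made up.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// hashModule returns an "h1:"-style checksum for a set of files, keyed by
// file name.
func hashModule(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // a fixed order keeps the result deterministic

	summary := sha256.New()
	for _, name := range names {
		// One line per file: the file's SHA-256 in hex, two spaces, its name.
		fmt.Fprintf(summary, "%x  %s\n", sha256.Sum256(files[name]), name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(summary.Sum(nil))
}

func main() {
	original := map[string][]byte{
		"example.com/a@v1.0.0/go.mod":     []byte("module example.com/a\n"),
		"example.com/a@v1.0.0/greeter.go": []byte("package greeter\n"),
	}
	recorded := hashModule(original) // what go.sum would remember

	// Same name and version, but the content was changed after the fact.
	tampered := map[string][]byte{
		"example.com/a@v1.0.0/go.mod":     []byte("module example.com/a\n"),
		"example.com/a@v1.0.0/greeter.go": []byte("package greeter // changed\n"),
	}

	fmt.Println("same content matches:", hashModule(original) == recorded)     // true
	fmt.Println("tampered content matches:", hashModule(tampered) == recorded) // false
}
```

Without a checksum db, the hash computed on the first download is simply trusted and written to go.sum; it only protects later downloads on that same project.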


This is not too bad if it is the first time that you run the go get command on your machine, and if you were lucky enough to get untampered content despite the first trust issue.


It also costs time: with this setup, Go needs to download whole dependencies just to parse their go.mod files. On the very first go get, Go does this anyway since there is no cache on the machine, but doing it every time results in long downloads and delays.


Besides, each time a new developer runs go get without a proxy and a sumdb, the first trust issue becomes a bigger issue. Assuming that every developer starts a fresh project this way (no proxy and no sumdb), then even if all the developers use the same set of dependencies at the same versions, it is not unlikely that Go generates different go.sum files for them, especially with a big enough population of developers and a long enough period of time.


So with this scenario, we can now imagine how a checksumdb will help us prevent this first trust issue at the developer level.


With a checksumdb, we have a centralized way of verifying the content of our downloads, assuming that the centralized checksumdb has already registered a hash for the target set of downloads.


This way, the first trust issue moves from the developer level to the community level: as soon as a checksum for a module version is recorded in the checksumdb (hopefully one that reflects sane content), the whole community is protected against content tampering or changes, because every developer now checks against a single source of truth.


So it goes from: "it is every developer's first trust issue", to: "it is only the first developer's first trust issue".


This is because only the first developer finds himself/herself with no existing checksum to check against when verifying the content of what he/she just downloaded using the go toolchain. Hopefully, this first developer is the maintainer himself/herself doing some tests.


This is pretty much the role of gosumdb.
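

The "only the first developer's first trust issue" idea can be modeled with a tiny in-memory stand-in for a checksum db. This is a sketch of the concept only: a real checksum database is a remote, tamper-evident service, and the type and method names here are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errMismatch signals that downloaded content differs from the recorded hash.
var errMismatch = errors.New("checksum mismatch")

// sumDB is an in-memory stand-in for a shared checksum database.
type sumDB struct {
	records map[string]string // "module@version" -> hash
}

func newSumDB() *sumDB {
	return &sumDB{records: map[string]string{}}
}

// verify records the hash on the first request for a module version and
// compares against that record on every later request.
func (db *sumDB) verify(modVersion, hash string) (first bool, err error) {
	recorded, ok := db.records[modVersion]
	if !ok {
		db.records[modVersion] = hash // the single "first trust" moment
		return true, nil
	}
	if recorded != hash {
		return false, errMismatch
	}
	return false, nil
}

func main() {
	db := newSumDB()

	// The first developer has nothing to check against.
	first, err := db.verify("example.com/a@v1.0.0", "h1:aaa")
	fmt.Println(first, err) // true <nil>

	// A later developer downloads identical content: verified.
	first, err = db.verify("example.com/a@v1.0.0", "h1:aaa")
	fmt.Println(first, err) // false <nil>

	// Another developer receives altered content under the same version.
	first, err = db.verify("example.com/a@v1.0.0", "h1:bbb")
	fmt.Println(first, err) // false checksum mismatch
}
```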


What about the proxy? Where does it come into play?


Well, the proxy tackles the first set of issues encountered in our case study, i.e. latency and availability.


Latency:


Without a goproxy, go get is forced to download whole zip files just to read the go.mod files, each of which is usually tiny compared to the zip file.


A Go proxy is just an HTTP server that exposes a handful of endpoints. go get talks to these endpoints, so the proxy acts as a middleman between the developer's machine and the remote server where the code is hosted.


These endpoints are described by the command $ go help goproxy


In a nutshell, a Go proxy lets go get quickly list all available versions of a dependency. Once a version is selected, go get can also ask the proxy for the go.mod file directly, all without any large download. So with a goproxy, the Go toolchain can go through the whole dependency resolution process without downloading any zip files, which is much faster than how it is done in our case study (i.e. without a goproxy).
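

As an illustration of those endpoints, the sketch below only builds the URLs that the proxy protocol defines for one module version; it performs no network request. `https://proxy.golang.org` is the public proxy run by the Go team, while the module path `github.com/MassiGy/demo` and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// escape applies the proxy protocol's case encoding: every uppercase ASCII
// letter becomes "!" followed by its lowercase form, so that paths stay
// unambiguous on case-insensitive file systems.
func escape(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			r += 'a' - 'A'
		}
		b.WriteRune(r)
	}
	return b.String()
}

// endpoints groups the URLs a client can query for one module version.
type endpoints struct {
	List string // known versions, one per line
	Info string // small JSON document describing the version
	Mod  string // the go.mod file alone
	Zip  string // the full source archive
}

// proxyEndpoints builds the endpoint URLs under the given proxy base URL.
func proxyEndpoints(base, modulePath, version string) endpoints {
	prefix := strings.TrimSuffix(base, "/") + "/" + escape(modulePath) + "/@v/"
	v := escape(version)
	return endpoints{
		List: prefix + "list",
		Info: prefix + v + ".info",
		Mod:  prefix + v + ".mod",
		Zip:  prefix + v + ".zip",
	}
}

func main() {
	// Hypothetical module path, used only to show the URL shapes.
	e := proxyEndpoints("https://proxy.golang.org", "github.com/MassiGy/demo", "v1.0.0")
	fmt.Println(e.List)
	fmt.Println(e.Info)
	fmt.Println(e.Mod)
	fmt.Println(e.Zip)
}
```

Dependency resolution only needs the list and .mod endpoints; the .zip endpoint is hit once the final set of versions is known.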


Availability:


Most go proxies that are used out there are what we call mirrors. A mirror is just a proxy with some cache/storage. This allows the community to keep accessing a dependency through the mirror even though the maintainer decided to take it down.
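

A mirror's behavior can be sketched as a cache placed in front of an upstream fetch function. This is a toy model under the assumption that published versions never change (so cached content never needs refreshing); the names `mirror`, `fetchFunc` and `errGone` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errGone is what the origin returns once the repository was taken down.
var errGone = errors.New("repository not found")

// fetchFunc downloads the content of one "module@version" from the origin.
type fetchFunc func(modVersion string) ([]byte, error)

// mirror is a proxy that remembers everything it has served once.
type mirror struct {
	upstream fetchFunc
	cache    map[string][]byte
}

func newMirror(upstream fetchFunc) *mirror {
	return &mirror{upstream: upstream, cache: map[string][]byte{}}
}

// get serves from the cache when possible and falls back to the origin.
func (m *mirror) get(modVersion string) ([]byte, error) {
	if data, ok := m.cache[modVersion]; ok {
		return data, nil
	}
	data, err := m.upstream(modVersion)
	if err != nil {
		return nil, err
	}
	m.cache[modVersion] = data
	return data, nil
}

func main() {
	online := true
	origin := func(modVersion string) ([]byte, error) {
		if !online {
			return nil, errGone
		}
		return []byte("source of " + modVersion), nil
	}
	m := newMirror(origin)

	// A first request goes through to the origin and fills the cache.
	if _, err := m.get("example.com/a@v1.0.0"); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}

	online = false // the maintainer takes the repository down

	data, err := m.get("example.com/a@v1.0.0")
	fmt.Println(string(data), err) // source of example.com/a@v1.0.0 <nil>

	_, err = m.get("example.com/a@v1.1.0") // never requested before
	fmt.Println(err)                       // repository not found
}
```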


So with this, we now understand how GOPROXY and GOSUMDB come into play for our Go toolchain and ecosystem.



If developer A works in a company and his/her project is private, but we still need to download it as a dependency: this scenario typically reflects what happens in a company where two colleagues want to collaborate. One of them wants to download a module developed by the other, but go get fails since the target is a private repo (within a private company).


The reason is simply that go get needs to authenticate to your company's GitLab or GitHub in order to access the shared modules. Otherwise, GitLab or GitHub will reject the requests fired by go get.


Fortunately, this is easy to do; it just requires some git and netrc configuration. (I will write more about this in a separate article.)


Something worth noting is that in these scenarios we generally disable the Go proxy and checksumdb, since most companies do not have these set up internally. Also, a few questions arise once these are set up, like:

  • Do you really want your proxy to be a mirror internally? Because this will allow a developer to keep accessing a dependency even if another team decided to take it down! (This should be discussed internally.)

As for an internal checksumdb, this becomes less important since your private dependencies are developed by your colleagues, so no harm should come from them, and each content change can be discussed internally.
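
In practice, skipping the proxy and checksumdb does not have to be all or nothing: the `GOPRIVATE` environment variable takes a comma-separated list of glob patterns, and module paths with a matching prefix bypass both. The sketch below approximates that prefix matching as I understand it; it is not the toolchain's code, and the host names are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesPrivate reports whether some leading portion of modulePath matches
// one of the comma-separated glob patterns.
func matchesPrivate(patterns, modulePath string) bool {
	elems := strings.Split(modulePath, "/")
	for _, glob := range strings.Split(patterns, ",") {
		if glob == "" {
			continue
		}
		// A pattern with n path elements is compared against the first n
		// elements of the module path.
		n := strings.Count(glob, "/") + 1
		if len(elems) < n {
			continue
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, _ := path.Match(glob, prefix); ok {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical value of GOPRIVATE for a company.
	private := "git.corp.example.com,*.internal.example.com"

	for _, mod := range []string{
		"git.corp.example.com/team/lib",
		"api.internal.example.com/billing",
		"github.com/pkg/errors",
	} {
		fmt.Printf("%-35s private=%v\n", mod, matchesPrivate(private, mod))
	}
}
```

With such a setting, public dependencies still benefit from the proxy and checksumdb, while private ones are fetched directly from the company's servers.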


Resources: (in order)

  • Free code camp: GOPATH vs Go modules.
  • Go class: 40 Go modules (YouTube lecture given by Matt Holiday).
  • Gophercon: A life cycle of a query (YouTube lecture by Katie Hockman).
  • Gophercon: Go has versions - go & vgo - 2018 (YouTube lecture by Russ Cox).