🚨 This is an in progress article, if you want the code, you can grab it here
If you ever released an Android app, you had to deal with versionName
and versionCode
. Every Android app has those 2 versions.
versionName
is really straightforward, it’s just a string that the user can see on the phone settings, something like "1"
, "2.3.4"
or "0.1-beta"
. You just put whatever it makes sense there. Usually a variation of the semantic version scheme that makes sense to the release schedule of the product.
versionCode
on the other hand is a positive 32-bit integer, it is the internal version used by the framework to identify if an APK is greater than the other. Android will not let you install an APK with with a versionCode
lower than the versionCode
of the same app that you have installed on the phone (no downgrades allowed).
That means that for every new app version that you ship, it you should have a greater versionCode
than the previous. A pretty easy strategy that complies with this requirement is to just start with version 1
and increment at every release. This might not be good enough for some projects, reasons range from having to support different APK variants of the same app to just wanting the versionCode
to be related with versionName
. For this, we need a versionCode
schema.
Append things together
Android docs have some examples on how to encode the versionCode
. Let’s imagine that we have an app following the semantic version scheme, if we want to release version "1.203.4"
, we could use 120304
as the versionCode
. We just concatenate all the numbers together and adding trailing zeroes where necessary, in this case, we use 04
instead of 4
because we might have patch numbers taking more than one digit, like 11
, in the future.
If we have multiple APK variants, let’s say we maintain different APKs for minSdk
21, 23 and 24, we could append those numbers at the start of our previous version, such as 21120304
and 23120304
.
Do some math
The previous example is actually a scenario taken from the documentation where they show a different way encoding version code, where we have a “base” version code and we add large numbers to it based on the product flavor. You might find the following code familiar:
android {
flavorDimensions += listOf("api", "mode")
productFlavors {
create("minApi24") {
dimension = "api"
minSdk = 24
// To ensure the target device receives the version of the app with
// the highest compatible API level, assign version codes in increasing
// value with API level.
versionCode = 30_000 + (android.defaultConfig.versionCode ?: 0)
versionNameSuffix = "-minApi24"
// ...
}
create("minApi23") {
dimension = "api"
minSdk = 23
versionCode = 20_000 + (android.defaultConfig.versionCode ?: 0)
versionNameSuffix = "-minApi23"
// ...
}
create("minApi21") {
dimension = "api"
minSdk = 21
versionCode = 10_000 + (android.defaultConfig.versionCode ?: 0)
versionNameSuffix = "-minApi21"
// ...
}
}
}
This code makes sure that if we have a base versionCode
of 100
, we would generate:
10_100
forminApi21
20_100
forminApi23
30_100
forminApi24
And basically having 10,000 versions reserved for each flavor. For our previous versioning scheme, those numbers would probably not be enough, since that with a base version of 120304
we would have:
130_304
forminApi21
140_304
forminApi23
150_304
forminApi24
While it looks like we still have 10,000 version for each flavor, our base versionCode
does not increment by one, remember? It follows the semantic version that we assign to our app (1.203.4), so, if we get to version 1.300.0, we would get version 140_000
for minApi21
, conflicting with a version reserved for minApi23
.
“1.200.0” and “1.300.0” are just 100 versions apart, so, for our version scheme, we would probably need to add 100.000, 200.000 and so on to get to safe ranges with this approach (assuming that the major version changes very slowly and we would not get to the version “10.0.0” in the lifetime of this app, or else we would need even larger numbers).
I do think the second encoding approach (adding numbers to a base version) is error prone, and while the first approach (appending numbers together) is simple and produces a version that can be easily parsed by looking at it, it uses unnecessary space. Let’s look at the patch version that we are using so far, we reserve 2 digits for the patch (04
in 120304
) because we might get into patch number 10 if things go very wrong, but assuming that we release a new minor version every two weeks, we will probably never get to patch number 99. While this seems like a no-problem, some teams use the release date as part of the version, making versionCode
very close to $2^{31}$ with the previous approaches (been there).
In case is not clear why I’m mentioning $2^{31}$ instead of $2^{32}$, it is because
versionCode
should be a positive integer, so we only have around half of the possible values in the 32-bit integer to work with
Do it like a hacker
A more efficient way of storing those complex versions on an integer, is to use the integer bits directly, just like a C programmer would do 💻.
Let’s look again at the patch version that was taking 2 decimal digits in 120304
, it may be safe to assume that there will be no more than 16 patches for a given version, but since it is taking 2 decimal digits, it has a valid range of 00–99, and that takes 7 bits on every versionCode
we create. We could instead reserve the last 4 bits and have 16 valid patch numbers for each version (0-15).
We can do that for every component in our version, choosing the number of bits necessary to encode all expected values of that component, until we use all the available 31 bits of the integer. Making sure we put the components of higher order in the high order bits compared to the lower order components, i.e., major goes to the higher order bits while patch goes to lower order ones.
Even if we have multiple APKs, like the previous example. We can map them to some small integer, in the previous example, we could map it like this
minApi21
maps to 0minApi23
maps to 1minApi24
maps to 2
This would only take 2 bits to encode, leaving space for one additional APK in the future. And we could turn our version into something like “2.1.203.4”, meaning: minApi24, major 1, minor 203, patch 4.
For the remaining part of this section, let’s just focus on the normal semantic version and let’s not think about the multiple APKs problem. Let’s also assume that we are dealing with a 16-bit integer instead of a 32 one. This will make the examples and diagrams easier to understand.
We can use 15 bits from this integer (remember, we can’t have negative numbers). Let’s use 3 bits for the major, 8 bits for the minor and 4 for the patch. This will give us the following range:
- $0 \leq \text{major} \lt 8$
- $0 \leq \text{minor} \lt 256$
- $0 \leq \text{patch} \lt 16$
Now let’s try to visualize the bits of this integer.
If we build the version “0.0.0”, is clear that all the bits will just be unset, like this:
Likewise, if we go to the max version “7.255.15”, all the bits should be set (apart from the sign bit):
And “1.2.3” would look like this:
The lost art of bit manipulation
OK, we can reserve some bits in the 32-bit Int
to encode our version in a nice way, but how do we do this? There is a small set of operations we can use that target the bits in an integer.
Given the vast amount of available memory and storage on modern devices (even for smartphones), some programmers are not used to do bit manipulations anymore, let’s try to build this and learn about some of those techniques along the way. Our goal will be to encode the semantic version scheme with the following specs:
- The major component will take 7 bits;
- The minor component will take 19 bits;
- The patch component will take 5 bits.
The naive approach
So, we have 3 components, each of them will require a given number of bits to encode and a bit only takes 2 possible values. We can maybe have arrays of booleans for each component, each boolean can encode a bit, and we just need to find a way to turn all of those “bits” into a final integer. Let’s give it a try:
class VersionCode(val major: Int, val minor: Int, val patch: Int) {
private val majorBits = Array(MAJOR_BITS) { idx ->
major.getComponentBit(MAJOR_BITS, idx)
}
private val minorBits = Array(MINOR_BITS) { idx ->
minor.getComponentBit(MINOR_BITS, idx)
}
private val patchBits = Array(PATCH_BITS) { idx ->
patch.getComponentBit(PATCH_BITS, idx)
}
private fun Int.getComponentBit(componentSize: Int, idx: Int): Boolean {
// ...
}
companion object {
private const val PATCH_BITS = 5
private const val MINOR_BITS = 19
private const val MAJOR_BITS = 7
}
}
The idea here, is that we will have a class that will take all the version components and store them into boolean arrays. The challenge arises when implementing getComponentBit
function, this function takes an integer and an index and should return the corresponding bit (true
for a 1 bit and false
for 0 bit) for that integer. For example, if we call the function like this: 4.getComponentBit(idx = 0)
it should return the most significant bit from the integer 4
; An index of 31
should return the least significant bit.
You might have noticed that the function also takes a componentSize
argument, that is the size, in bits, of a given component. That’s because we should interpret the major
32-bit integer as a 5-bit integer, so, when we request the bit of index 0, we are actually getting the bit of index 27 (because we ignore the 27 most significant bits).
Yeah, that extra componentSize
complicates things, as we have to work with 32-bit integer from the Kotlin language, but should interpret them as variable size integers. If we do some math we can extract another function that will only work with 32-bit integer and will not assume anything:
private fun Int.getComponentBit(componentSize: Int, idx: Int): Boolean {
return this[Int.SIZE_BITS - (componentSize - idx)]
}
private operator fun Int.get(index: Int): Boolean {
// ...
}
The way that this works is that the get
function actually returns the bit you want, so if I ask for the bit 0 (the syntax is 4[0]
), it will get me the bit 0, and getComponentBit
will translate the bit I meant to the bit I need. If I ask for 4.getComponentBit(componentSize = 5, idx = 0)
, is the same as 4[27]
; 4.getComponentBit(componentSize = 5, idx = 1)
is the same as 4[28]
, and so on.
Alright, there is no avoiding this now, how do we get the value for a given bit from an integer? We can still get away with not using bitwise operations by doing some math! You know how in base 10 we can do an integer division by 10 and the number “shifts right”? Something like this:
The $/$ operator below means an integer division
$$ 2731 / 10 = 273 $$
$$ 273 / 10 = 27 $$
$$ 27 / 10 = 2 $$
And at any point we can get the reminder of this division to get the value of the least significant digit, like in:
$$ 273 \bmod 10 = 3 $$
The same thing happens in base 2 with division by 2:
$$ 10110_2 / 10_2 = 1011_2 $$
$$ 1011_2 / 10_2 = 101_2 $$
And the remainder works the same as well:
$$ 1011_2 \bmod 10_2 = 1_2 $$
With this in mind, we can do the following:
- Shift the number right until the bit we want is the least significant bit
- Divide by 2 and get the remainder
- If the remainder is 1, return
true
, else, returnfalse
In Kotlin it would look something like this:
private operator fun Int.get(index: Int): Boolean {
var divisor = 1
repeat(Int.SIZE_BITS - index - 1) {
divisor *= 2
}
val bit = (this / divisor) % 2
return bit == 1
}
We can simplify this by implementing the exponentiation operation for integers (it is only defined for floating point numbers):
private operator fun Int.get(index: Int): Boolean {
val divisor = 2 toThe (Int.SIZE_BITS - index - 1)
val bit = (this / divisor) % 2
return bit == 1
}
private infix fun Int.toThe(exponent: Int): Int {
return toDouble().pow(exponent).toInt()
}
That’s cool! Now we have all the bits encoded on arrays, we just need to concatenate them together on a single array and interpret this array as a single 32-bit integer. It goes like this:
private val majorBits = Array(MAJOR_BITS) { idx ->
major.getComponentBit(MAJOR_BITS, idx)
}
private val minorBits = Array(MINOR_BITS) { idx ->
minor.getComponentBit(MINOR_BITS, idx)
}
private val patchBits = Array(PATCH_BITS) { idx ->
patch.getComponentBit(PATCH_BITS, idx)
}
// concatenate into a single array
private val bits = majorBits + minorBits + patchBits
val value: Int = bits.toInt()
private fun Array<Boolean>.toInt(): Int {
return this
.map { bool -> if (bool) 1 else 0 }
.foldIndexed(0) { idx, acc, bit ->
acc + bit * (2 toThe (lastIndex - idx))
}
}
In case you are not familiar with fold*
and reduce*
family of functions, this code is the same as:
private fun Array<Boolean>.toInt(): Int {
val intArray = this.map { bool -> if (bool) 1 else 0 }
var result = 0
intArray.forEachIndexed { idx, bit ->
result += bit * (2 toThe (lastIndex - idx))
}
return result
}
In both cases, the bit * (2 toThe (lastIndex - idx))
part is just calculating the contribution of a given bit to the final integer, where the leftmost bit corresponds to the highest power of 2.
Let’s run this through an example, assume that the input array is:
val bits = arrayOf(true, false, true, true) // Represents 1011 in binary
val value = bits.toInt() // Expected result: 11
The intArray
will be [1, 0, 1, 1]
. The conversion will work as follows:
$$ 1 \times 2^3 = 8 \\ 0 \times 2^2 = 0 \\ 1 \times 2^1 = 2 \\ 1 \times 2^0 = 1 $$
And the final sum is: $$8+0+2+1=11$$
Binary manipulation toolkit
Do you remember how we did some math to implement the get
function? Well, that not how real hackers do it. You see, most languages support a set of bitwise operations that execute at the individual bit level, they can be more performant and, for some things, easier to use. Let’s introduce a couple of them and rewrite the function.
There are two steps the get
function performs to get a given bit, shift the value right until the bit is the least significant, and extract the least significant (binary) digit. That is actually a binary operation called “shift right” that we can use here, we can use the function shr
. This function accepts the number of bits we need to shift right, so we don’t need to compute the power of 2 that we need to divide.
For example, if we wanted to shift a given number 4 bits to the right, we would need to perform the operation $n / 2^4$, with the shr
we just need to do $n\ \mathrm{shr} \ 4$.
So, the function would look like this:
private operator fun Int.get(index: Int): Boolean {
val bit = (this shr (Int.SIZE_BITS - index - 1)) % 2
return bit == 1
}
What about the mod operator at the end? There is also a bitwise technique in the toolbox for such a thing, we can create a mask and apply the bitwise and
operation.
Suppose we have an integer n
where we want to clear every bit except the bit at index 4
, we can create a mask (another integer value) where only the 4th bit is set. We then apply the and
operator between those two values. This operator applies a logical and operation (assuming 1
is true
and 0
is false
) between every corresponding bit of the two values, and returns the result.
So, if we and zero and any other number together, we get zero, because any value and false is equal to false. For example:
$$ 10110_2 \ \mathrm{and} \ 00000_2 = 00000_2 $$
On the other hand, if we apply the and operator to any value with a mask with all bits equal 1, we get the original value, since any value and true is equal to the first value. See:
$$ 10110_2 \ \mathrm{and} \ 11111_2 = 10110_2 $$
We can use this property to create a mask that will leave only one bit unchanged and clear the rest. So, for keeping the 4th bit unchanged, we do:
$$ 10110_2 \ \mathrm{and} \ 00010_2 = 00010_2 $$
Back to our get
function, since we always want to get the least significant bit, we can use 1
as the mask, so the function becomes:
private operator fun Int.get(index: Int): Boolean {
val bit = this shr (Int.SIZE_BITS - index - 1) and 1
return bit == 1
}
The final code without doing validations to make sure we don’t accept version components greater than the specified size is something like:
class VersionCode(val major: Int, val minor: Int, val patch: Int) {
private val majorBits = Array(MAJOR_BITS) { idx ->
major.getComponentBit(MAJOR_BITS, idx)
}
private val minorBits = Array(MINOR_BITS) { idx ->
minor.getComponentBit(MINOR_BITS, idx)
}
private val patchBits = Array(PATCH_BITS) { idx ->
patch.getComponentBit(PATCH_BITS, idx)
}
private fun Int.getComponentBit(componentSize: Int, idx: Int): Boolean {
return this[Int.SIZE_BITS - (componentSize - idx)]
}
private operator fun Int.get(index: Int): Boolean {
val bit = this shr (Int.SIZE_BITS - index - 1) and 1
return bit == 1
}
private val bits = majorBits + minorBits + patchBits
val value: Int = bits.toInt()
private fun Array<Boolean>.toInt(): Int {
return this
.map { bool -> if (bool) 1 else 0 }
.foldIndexed(0) { idx, acc, bit ->
acc + bit * (2 toThe (lastIndex - idx))
}
}
private infix fun Int.toThe(exponent: Int): Int {
return toDouble().pow(exponent).toInt()
}
companion object {
private const val PATCH_BITS = 5
private const val MINOR_BITS = 19
private const val MAJOR_BITS = 7
}
}
You can check the full code here.
The way of the hacker
You might have noticed that we are keeping a boolean array of size 31 to store all the bits for the final version code, but since we can transform all those booleans into a single Int
value at the end, what is stopping us from using a single Int
to hold all of those bits and get rid of the boolean arrays? That is exactly what we are going to do now.
What we need here, is to use the shl
function to shift major
, minor
and patch
left the correct amount of bits and add them together. In order to understand why we need to do this, consider a simpler example: Image that we want to encode the version 1.3.7
(1, 11, 111 in binary, respectively) and major
, minor
and patch
take 2, 3 and 4 bits, respectively. We would leave patch
unchanged, shift minor
4 bits left (so it leave space for patch
in the least significant bits) and shift major
7 bits left (that is 3 + 4, leaving space for both minor
and patch
on the least significant bits).
Doing this, none of the bits in each version component will overlap, and we can safely add them, it will look like this:
$$ (1_2 \ \mathrm{shl} \ 7) + (11_2 \ \mathrm{shl} \ 4) + (111_2 \ \mathrm{shl} \ 0) = 10000000_2 + 110000_2 + 111_2 = 10110111_2 $$
Let’s figure it out how much we need to shift left for each component, that can be easily done by adding up the size of every lower component, like this:
companion object {
private const val PATCH_BITS = 5
private const val MINOR_BITS = 19
private const val PATCH_SHIFT = 0
private const val MINOR_SHIFT = PATCH_SHIFT + PATCH_BITS
private const val MAJOR_SHIFT = MINOR_SHIFT + MINOR_BITS
}
Now we remove all of that array stuff and simply compute the result directly:
val value: Int = (major shl MAJOR_SHIFT) + (minor shl MINOR_SHIFT) + (patch shl PATCH_SHIFT)
But if you want to gain some cool points on the street, you should probably replace those additions with the bitwise or
:)
val value: Int = (major shl MAJOR_SHIFT) or (minor shl MINOR_SHIFT) or (patch shl PATCH_SHIFT)
If you remove all the runtime validation, the code is actually very compact:
class VersionCode(val major: Int, val minor: Int, val patch: Int) {
val value: Int = (major shl MAJOR_SHIFT) or (minor shl MINOR_SHIFT) or (patch shl PATCH_SHIFT)
companion object {
private const val PATCH_BITS = 5
private const val MINOR_BITS = 19
private const val PATCH_SHIFT = 0
private const val MINOR_SHIFT = PATCH_SHIFT + PATCH_BITS
private const val MAJOR_SHIFT = MINOR_SHIFT + MINOR_BITS
}
}
You can compare with the version with validations here.
The way of the engineer
IN PROGRESS…