Swiftwater Logo

The Curious Case of the Protocol Default

This entry is part 21 of 21 in the series Swiftwater

META: This is also published on Medium.

THE PROBLEM

I often write SDKs, and, when I do so, I like to follow a philosophy that I nickname “SQUID”.

One of the ways that I implement the “S” in “SQUID” (which stands for “Simplicity”), is to provide internal data structures as protocols, instead of classes or structs. It’s not a really big deal, but it does help to reduce the complexity and overhead of using the SDK.

As I was working on an SDK recently, I encountered a very strange issue. When I called methods, or referenced properties, in SDK classes, I got the default implementation behavior from the protocol definition; not the expected class implementation.

Whiskey Tango Foxtrot

Under the hood, the data structure was a class, but the presentation to the SDK user was as a protocol.

This protocol had an extension, in which I provided default implementations. I use this pattern frequently, in order to make protocol conformance “optional.” The user of the protocol doesn’t have to implement the entity (like a variable/property or method/function), as the default implementation “absorbs” the requirement. Like so:

protocol A {
    var thisIsRequired: String { get }
    var thisIsOptional: String { get }
}

extension A {
    var thisIsOptional: String { get { "" } set {} }
}

struct conformsToA: A {
    var thisIsRequired: String { "Struct conformsToA" }
}

In the above example, the conformant struct only needed to implement the “thisIsRequired” property, as the protocol extension took care of the “thisIsOptional” property.

Fair ’nuff. Seems pretty straightforward, eh?

There Ain’t No Such Thing As A Free Lunch (T.A.N.S.T.A.A.F.L.)

Protocols are really cool. They are one aspect of Swift that makes it an incredibly powerful language, but protocols are not classes.

That bears repeating: protocols are not classes.

They may smell like classes, because they offer a sort of “poor man’s hierarchy,” but they don’t have polymorphism.

A truly polymorphic class allows a “base class” to define a method or data member, only to have that co-opted by subclasses that derive from the base class. This is usually done via a mechanism called “vtables”. Swift uses this mechanism for classes, but not for structs, enums or protocols.

You can then treat the subclass as an instance of the base class, but when you call the method, or reference the overridden property, the subclass variant of the element is accessed.

You can’t derive from structs or enums. You can only extend them, which is a different mechanism.

When you conform to a protocol in Swift, you use almost exactly the same syntax as subclassing:

protocol PA {    // Defining Protocol PA

struct SA: PA {    // Conforming to Protocol PA

class CA: PA {    // Conforming to Protocol PA

class CB: CA {    // Deriving From (Subclassing) Class CA

This can easily lull us into thinking of conforming to protocols as “the same as” deriving from a class (especially when we are using default implementations in our protocols). That’s a big reason that Apple is so insistent on using the correct language, when discussing the use of protocols.

But under the hood, things are quite different.

When we implement a protocol-defined property/method, we are replacing the default implementation; not overriding it.

With a subclass, you can always use the “super” keyword to access the properties and methods of the class from which your subclass derives.

You can’t do that with protocols. Once you replace the default implementation; it’s gone for your instance.

That means there’s no vtable for protocols. All a protocol is, is a contract. It promises that the instance that conforms to the protocol will present a certain interface.

The default implementation is really just a way to “add value” to a protocol, so that conformance doesn’t have to be too onerous, and it reduces code duplication.

In that respect, protocols with default implementations more closely resemble PHP Traits, than classes (I think I just earned the enmity of an untold number of people by comparing Swift to PHP).

But What Does That Have to Do With Our problem?

Glad you asked. I decided to do some experimenting, and ended up using the following structure (and included playground) for testing:

UML Diagram for My Experiment. Purple is Protocol. Dotted Line means “Conforms To” or “Extends.”

In the above diagram, I have a protocol (PA) that defines a property as required for conformance (myName).

There is one struct (SA) and one class (CA) that conform to that protocol. Because myName is required, they each also define an instance property of myName.

Since CA is a class, we can derive from it (CB), and that subclass overrides the myName property.

We have a second protocol (PB), that extends protocol PA, and adds a default implementation of myName. This effectively makes myName an “optional” property. Conformant entities don’t need to implement it.

We define two structs (SB and SC) that conform to PB, but SB does not define the myName property; instead, relying on the default implementation defined by the PB extension.

We also define two classes (CC and CF) that conform to PB. As before, CC relies on the default implementation of myName, from PB.

We define two classes that derive from these classes, CD and CG. CD defines a first conformant implementation for the myName property. CG overrides the property that was defined in CF.

Then we have one more class that overrides its superclass: CE. CE overrides the CD implementation of myName.

In the playground below, you will see that I defined a few functions to print the values of myName.

If you read the playground, you will see that a straightforward, direct printing of the various structs and classes should result in something along these lines:

 ENTITY    SHOULD PRINT
Struct A    "Struct A"
Class A     "Class A"
Class B     "Class B"
Struct B    "Protocol B"
Struct C    "Struct C"
Class C     "Protocol B"
Class D     "Class D"
Class E     "Class E"
Class F     "Class F"
Class G     "Class G"

Note the expected printouts for Struct B (SB) and Class C (CC). These don’t define their own implementations of myName, so they fall back on the default implementation provided by Protocol B (PB).

These should be the only two instances where we see the String “Protocol B” printed.

Throw The Switch, Igor!

Now, let’s run the playground, and see what printouts we actually get:

PART ONE
	Direct Print:
		Struct A: Struct A
		Class A: Class A
		Class B: Class B
	printAsProtoclA(_: PA):
		Struct A: Struct A
		Class A: Class A
		Class B: Class B
	printAsStructA(_: SA):
		Struct A: Struct A
	printAsClassA(_: SA):
		Class A: Class A
		Class B: Class B
PART TWO
	Direct Print:
		Struct B: Protocol B
		Struct C: Struct C
		Class C: Protocol B
		Class D: Class D
		Class E: Class E
		Class F: Class F
		Class G: Class G
	printAsProtoclA(_: PA):
		Struct B: Protocol B
		Struct C: Struct C
		Class C: Protocol B
		Class D: Protocol B
		Class E: Protocol B
		Class F: Class F
		Class G: Class G
	printAsProtoclB(_: PB):
		Struct B: Protocol B
		Struct C: Struct C
		Class C: Protocol B
		Class D: Protocol B
		Class E: Protocol B
		Class F: Class F
		Class G: Class G
	printAsStructB(_: SB):
		Struct B: Protocol B
	printAsStructC(_: SC):
		Struct C: Struct C
	printAsClassC(_: CC):
		Class C: Protocol B
		Class D: Protocol B
		Class E: Protocol B
	printAsClassD(_: CD):
		Class D: Class D
		Class E: Class E
	printAsClassF(_: SF):
		Class F: Class F
		Class G: Class G

Oh, dear. We seem to have some erroneous printouts of “Protocol B” in Part Two (Part One looks fine).

Specifically, Class D (CD) and Class E (CE) are problematic. What do they have in common?

Well, CE derives directly from CD, which derives directly from CC. They are all in the same “family.” We expect CC to print “Protocol B,” but CD and CE should print “Class D” and “Class E,” respectively.

If we look up at the UML diagram, we see that CC did not define an instance of myName. Instead, it relied on the default implementation from Protocol B.

Now, note that this worked fine:

	printAsClassD(_: CD):
		Class D: Class D
		Class E: Class E

But this did not:

	printAsClassC(_: CC):
		Class C: Protocol B
		Class D: Protocol B
		Class E: Protocol B

The difference between the two, is where the hierarchy started, with the argument passed into the function that did the printing.

It seems that we can’t expect a vtable, if we don’t start with a class that defines or overrides a virtual method.

If we didn’t have the protocol conformance (and default implementation), then this would be a syntax error:

let name = classInstanceCC.myName

That’s because CC never defined the myName property. myName wasn’t explicitly defined in the hierarchy until CD. The only reason we could get away with that statement, was because of the default implementation provided by Protocol B.

This Is A “Loophole”

No language is perfect, but (in my opinion), Swift comes fairly close. This is a rather benign example of a weakness. This issue is a fairly unavoidable by-product of having protocol default implementations.

I’d rather have default implementations, than a syntax error when we try to use an overridden method that was defined in a default implementation.

CONCLUSION

We learned that Swift has a “loophole,” where unexpected behavior can happen, in a fairly “edge” case, where we override methods that are defined in a protocol default implementation, then access those overrides before their virtual implementation.

It’s a bit weird, but now that we know what to look for, it’s easy to avoid.

UPDATE

I was directed towards this Apple bug report. It appears as if it is the cause of this issue.

SAMPLE PLAYGROUND

// This is the little demo, at the start.
protocol A {
    var thisIsRequired: String { get }
    var thisIsOptional: String { get }
}

extension A {
    var thisIsOptional: String { get { "" } set {} }
}

struct conformsToA: A {
    var thisIsRequired: String { "Struct conformsToA" }
}

// In the first protocol, we don't use a default implementation (required conformance).
protocol PA {
    var myName: String { get }
}

struct SA: PA {
    var myName: String = "Struct A"
}

class CA: PA {
    var myName: String = "Class A"
}

class CB: CA {
    override var myName: String { get { "Class B" } set { } }
}

// In the second protocol, we add a default implementation (optional conformance).
protocol PB: PA { }

extension PB {
    var myName: String { "Protocol B" }
}

// This will "fall back" to the default implementation.
struct SB: PB { }

struct SC: PB {
    var myName: String = "Struct C"
}

// This will "fall back" to the default implementation.
class CC: PB { }

class CD: CC {
    var myName: String = "Class D"
}

class CE: CD {
    override var myName: String { get { "Class E" } set { } }
}

class CF: PB {
    var myName: String = "Class F"
}

class CG: CF {
    override var myName: String { get { "Class G" } set { } }
}

let structInstanceSA = SA()
let classInstanceCA = CA()
let classInstanceCB = CB()

// Part one is all about the required conformance protocol.
print("PART ONE")
print("\tDirect Print:")
print("\t\tStruct A: \(structInstanceSA.myName)")
print("\t\tClass A: \(classInstanceCA.myName)")
print("\t\tClass B: \(classInstanceCB.myName)")

func printAsProtoclA(_ inItem: PA) -> String { inItem.myName }

print("\tprintAsProtoclA(_: PA):")
print("\t\tStruct A: \(printAsProtoclA(structInstanceSA))")
print("\t\tClass A: \(printAsProtoclA(classInstanceCA))")
print("\t\tClass B: \(printAsProtoclA(classInstanceCB))")

func printAsStructA(_ inItem: SA) -> String { inItem.myName }

print("\tprintAsStructA(_: SA):")
print("\t\tStruct A: \(printAsStructA(structInstanceSA))")

func printAsClassA(_ inItem: CA) -> String { inItem.myName }

print("\tprintAsClassA(_: SA):")
print("\t\tClass A: \(printAsClassA(classInstanceCA))")
print("\t\tClass B: \(printAsClassA(classInstanceCB))")

let structInstanceSB = SB()
let structInstanceSC = SC()
let classInstanceCC = CC()
let classInstanceCD = CD()
let classInstanceCE = CE()
let classInstanceCF = CF()
let classInstanceCG = CG()

// Part two is where things get interesting.
print("PART TWO")
print("\tDirect Print:")
print("\t\tStruct B: \(structInstanceSB.myName)")
print("\t\tStruct C: \(structInstanceSC.myName)")
print("\t\tClass C: \(classInstanceCC.myName)")
print("\t\tClass D: \(classInstanceCD.myName)")
print("\t\tClass E: \(classInstanceCE.myName)")
print("\t\tClass F: \(classInstanceCF.myName)")
print("\t\tClass G: \(classInstanceCG.myName)")

func printAsProtoclB(_ inItem: PB) -> String { inItem.myName }

print("\tprintAsProtoclA(_: PA):")
print("\t\tStruct B: \(printAsProtoclA(structInstanceSB))")
print("\t\tStruct C: \(printAsProtoclA(structInstanceSC))")
print("\t\tClass C: \(printAsProtoclA(classInstanceCC))")
print("\t\tClass D: \(printAsProtoclA(classInstanceCD))")
print("\t\tClass E: \(printAsProtoclA(classInstanceCE))")
print("\t\tClass F: \(printAsProtoclA(classInstanceCF))")
print("\t\tClass G: \(printAsProtoclA(classInstanceCG))")

print("\tprintAsProtoclB(_: PB):")
print("\t\tStruct B: \(printAsProtoclA(structInstanceSB))")
print("\t\tStruct C: \(printAsProtoclA(structInstanceSC))")
print("\t\tClass C: \(printAsProtoclA(classInstanceCC))")
print("\t\tClass D: \(printAsProtoclA(classInstanceCD))")
print("\t\tClass E: \(printAsProtoclA(classInstanceCE))")
print("\t\tClass F: \(printAsProtoclA(classInstanceCF))")
print("\t\tClass G: \(printAsProtoclA(classInstanceCG))")

func printAsStructB(_ inItem: SB) -> String { inItem.myName }

print("\tprintAsStructB(_: SB):")
print("\t\tStruct B: \(printAsStructB(structInstanceSB))")

func printAsStructC(_ inItem: SC) -> String { inItem.myName }

print("\tprintAsStructC(_: SC):")
print("\t\tStruct C: \(printAsStructC(structInstanceSC))")

func printAsClassC(_ inItem: CC) -> String { inItem.myName }

print("\tprintAsClassC(_: CC):")
print("\t\tClass C: \(printAsClassC(classInstanceCC))")
print("\t\tClass D: \(printAsClassC(classInstanceCD))")
print("\t\tClass E: \(printAsClassC(classInstanceCE))")

func printAsClassD(_ inItem: CD) -> String { inItem.myName }

print("\tprintAsClassD(_: CD):")
print("\t\tClass D: \(printAsClassD(classInstanceCD))")
print("\t\tClass E: \(printAsClassD(classInstanceCE))")

func printAsClassF(_ inItem: CF) -> String { inItem.myName }

print("\tprintAsClassF(_: SF):")
print("\t\tClass F: \(printAsClassF(classInstanceCF))")
print("\t\tClass G: \(printAsClassF(classInstanceCG))")

// In a perfect world, this would be a syntax error:
let name = classInstanceCC.myName