A type is a collection of possible values. An integer can have values 0, 1, 2, 3, etc.; a boolean can have values true and false. We can imagine any type we like: for example, a HighFive type that allows the values "hi" or 5, but nothing else. It's not a string and it's not an integer; it's its own, separate type.
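To make the "collection of possible values" idea concrete, here is a minimal Python sketch that treats HighFive as nothing more than a membership test. The names `HIGH_FIVE_VALUES` and `is_high_five` are invented for illustration, and this is a runtime check, not a static type.

```python
# Hypothetical names for illustration; HighFive allows exactly "hi" or 5.
HIGH_FIVE_VALUES = ("hi", 5)


def is_high_five(value):
    """Return True only if value is one of the two HighFive values."""
    # Compare the type as well as the value so that 5.0 is not mistaken for 5.
    return any(
        type(value) is type(allowed) and value == allowed
        for allowed in HIGH_FIVE_VALUES
    )


print(is_high_five("hi"))     # a member of the type
print(is_high_five(5))        # the other member
print(is_high_five("hello"))  # a string, but not a HighFive
```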
Statically typed languages constrain variables' types: the programming language might know, for example, that `x` is an Integer. In that case, the programmer isn't allowed to say `x = true`; that would be an invalid program. The compiler will refuse to compile it, so we can't even run it. Different static type systems have different expressive power, and most popular type systems cannot directly express our HighFive type above (TypeScript's literal union type `"hi" | 5` is a notable exception), though many can express other, much more subtle ideas.
Dynamically typed languages tag values with types: the language knows that 1 is an integer and that 2 is an integer, but it can't know that the variable x will always hold integers. The language runtime will check these tags at various points. If we try to add two values, it might check that they're both numbers, or strings, or arrays. Then it will add the values, concatenate them, or error, depending on the values' type tags.
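The tag checking described above can be sketched in plain Python. The function `dynamic_add` is a hypothetical illustration of the decisions a runtime makes; it is not how any particular interpreter is actually implemented.

```python
def dynamic_add(left, right):
    """Mimic the type-tag checks a dynamic runtime performs before adding."""
    numbers = (int, float)
    if isinstance(left, numbers) and isinstance(right, numbers):
        return left + right  # numeric addition
    if isinstance(left, str) and isinstance(right, str):
        return left + right  # string concatenation
    if isinstance(left, list) and isinstance(right, list):
        return left + right  # list concatenation
    # The tags don't fit any supported combination, so signal an error.
    raise TypeError(
        f"cannot add {type(left).__name__} and {type(right).__name__}"
    )


print(dynamic_add(1, 2))
print(dynamic_add("a", "b"))
print(dynamic_add([1], [2]))
```

Every call pays for these checks at runtime, because the tags live on the values rather than on the variables.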
Static languages check a program's types at compile time without executing the program. Any program whose types don't follow the type system's rules is rejected. For example, most static languages will reject the expression `"a" + 1` (with C being a notable exception that allows it). The compiler knows that `"a"` is a string and `1` is an integer, and that `+` only works when the left and right hand sides are of the same type, so it doesn't need to run the program to know that there's a problem. Every expression in a statically typed language has a definite type that can be determined without executing the code.
Many statically typed languages require type declarations. The Java function `public int add(int x, int y)` takes two integers and returns a third integer. Other statically typed languages can infer types automatically. That same add function is written `add x y = x + y` in Haskell. We don't tell it the types, but it can infer them because it knows that `+` only works on numbers, so `x` and `y` must be numbers, so the function `add` must take two numbers as arguments. This doesn't decrease the "staticness" of the type system; Haskell's type system is notoriously static, strict, and powerful, and is more so than Java's on all three fronts.
Dynamically typed languages don't require type declarations, but also don't infer types. The types of variables aren't known at all until they have concrete values at runtime. For example, the Python function

    def f(x, y):
        return x + y

can add integers, concatenate strings, concatenate lists, etc., and we can't tell which will happen without running the program. Maybe f will be called with strings at one point and integers at another point, in which case x and y hold values of different types at different times. This is why we say that values in dynamic languages have types, but variables and functions don't. The value 1 is definitely an integer, but x and y above might be anything.
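As a quick illustration of that point, the same `f` can be called with several kinds of values, and only the values at each call decide what `+` does. This is a small sketch, not part of the original example.

```python
def f(x, y):
    return x + y


print(f(1, 2))      # x and y hold integers: addition
print(f("a", "b"))  # x and y hold strings: concatenation
print(f([1], [2]))  # x and y hold lists: concatenation

# Mixing a string and an integer is only detected when the call runs.
try:
    f("a", 1)
except TypeError as error:
    print("TypeError:", error)
```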
Most dynamic languages will error at runtime when types are used in incorrect ways (JavaScript is a notable exception; it tries to return a value for any expression, even when that value is nonsensical). When using dynamic languages, even a simple type error like `"a" + 1` can occur in production. Static languages will prevent many such problems, though the degree of prevention depends on the power of the type system.
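A sketch of how such an error can slip into production: the bug below sits in a branch that is rarely taken, so the function works until someone finally passes `rush=True`. The function `describe_order` and its parameters are invented for illustration.

```python
def describe_order(quantity, rush=False):
    """Build a label for an order (hypothetical example function)."""
    label = "items: " + str(quantity)
    if rush:
        # Bug: adds an integer to a string. Nothing flags this until
        # this branch actually executes.
        label = label + " priority " + 1
    return label


print(describe_order(3))  # the common path works fine

try:
    describe_order(3, rush=True)  # the rare path fails at runtime
except TypeError as error:
    print("TypeError:", error)
```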
Static and dynamic languages have fundamentally different ideas about what it means for programs to be valid. In a dynamic language, `"a" + 1` is a valid program: it will begin execution, then throw an error at runtime. However, in most static languages, `"a" + 1` is not a program: it won't be compiled and it won't run. It's invalid code, just as the random string of punctuation `!&%^@*&%^@*` is invalid code. This extra notion of validity and invalidity has no equivalent in dynamic languages.
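Python's built-in `compile` and `eval` make the distinction visible. In this sketch, the type-incorrect expression is accepted as a program and only fails when executed, while the punctuation string is rejected before anything runs. The variable names are illustrative.

```python
# '"a" + 1' is a valid Python program, so compiling it succeeds.
code = compile('"a" + 1', "<example>", "eval")

runtime_failure = None
try:
    eval(code)  # the error appears only when the code is executed
except TypeError as error:
    runtime_failure = error

syntax_failure = None
try:
    # Random punctuation is not a program at all; it never gets to run.
    compile("!&%^@*&%^@*", "<example>", "eval")
except SyntaxError as error:
    syntax_failure = error

print(type(runtime_failure).__name__)
print(type(syntax_failure).__name__)
```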
The terms "strong" and "weak" are extremely ambiguous. Here are some ways that the terms are used:
- Sometimes, "strong" means "static". That's easy enough, but it's better to say "static" instead because most of us agree on its definition.
- Sometimes, "strong" means "doesn't convert between data types implicitly". For example, JavaScript allows us to say `"a" + 1`, which we might call "weak typing". But almost all languages provide some level of implicit conversion, allowing automatic integer-to-float conversion like `1 + 1.1`. In practice, most people using "strong" in this way have drawn a line between "acceptable" and "unacceptable" conversions. There is no generally accepted line; they're all arbitrary and specific to the person's opinions.
- Sometimes, "strong" means that there's no way to escape the language's type rules.
- Sometimes, "strong" means memory-safe. C is a notable example of a memory-unsafe language. If `xs` is an array of four numbers, C will happily allow code that does `xs[5]` or `xs[1000]`, giving whatever value happens to be in the memory addresses after those used to store `xs`.
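Two of these definitions can be observed directly in Python. The sketch below shows one implicit conversion that Python accepts, one that it refuses, and the bounds check that makes out-of-range indexing an error instead of a read of neighboring memory. Variable names are illustrative.

```python
# Implicit integer-to-float conversion is accepted.
mixed = 1 + 1.1
print(type(mixed).__name__)

# Implicit string/integer conversion is refused.
try:
    "a" + 1
    conversion_refused = False
except TypeError:
    conversion_refused = True

# Indexing past the end of a four-element list is caught at runtime.
xs = [1, 2, 3, 4]
try:
    xs[5]
    bounds_checked = False
except IndexError:
    bounds_checked = True

print(conversion_refused, bounds_checked)
```

Whether that makes Python "strong" depends entirely on which of the definitions above one has in mind.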
Let's stop here in the name of brevity. Here's where some languages fall on these metrics. As the table shows, only Haskell is consistently "strong" by all of these definitions. Most languages are ambiguous.
Language | Static? | Implicit Conversions? | Rules Enforced? | Memory-Safe? |
---|---|---|---|---|
C | Strong | Depends | Weak | Weak |
Java | Strong | Depends | Strong | Strong |
Haskell | Strong | Strong | Strong | Strong |
Python | Weak | Depends | Weak | Strong |
JavaScript | Weak | Weak | Weak | Strong |
(An entry of "Depends" in the "Implicit Conversions" column means that the strong/weak distinction depends on which conversions we consider acceptable.)
Often, the terms "strong" and "weak" refer to unspecified combinations of the various definitions above, and other definitions not shown here. All of this confusion renders "strong" and "weak" effectively meaningless. When tempted to use the terms "strong" or "weak", it's better to simply describe the exact, concrete behavior in question. For example, we might say "JavaScript returns a value when we try to add a string to an integer, but Python throws an error". Then, we don't have to spend the effort to carefully agree on one of the many definitions of "strong", or, worse, end up with an unresolved misunderstanding simply due to terminology.
Most uses of "strong" and "weak" on the web are vague and ill-defined value judgements: they're used to say that a language is "good" or "bad", with the judgement dressed up in technical jargon. As Chris Smith has written:

> Strong typing: A type system that I like and feel comfortable with
>
> Weak typing: A type system that worries me, or makes me feel uncomfortable
Can we add static types to a dynamic language? In some cases, we can; in others, it's difficult or impossible.
The most obvious problem is `eval` and similar dynamic language features. Evaluating `1 + eval("2")` in Python gives us 3. But what does `1 + eval(read_from_the_network())` give us? It depends on what's on the network at runtime. If we get an integer, that expression is fine; if we get a string, it's not. We can't know what we'll get until we actually run, so we can't statically analyze the type.
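The following sketch simulates that situation. Here `read_from_the_network` is a hypothetical stand-in that simply returns the text it is given, so the example runs without a network; `add_one` is also an invented name.

```python
def read_from_the_network(payload):
    """Hypothetical stand-in: pretend payload arrived over the network."""
    return payload


def add_one(payload):
    # The type of the eval result depends entirely on the text received.
    # (Calling eval on untrusted input is unsafe; this only illustrates typing.)
    return 1 + eval(read_from_the_network(payload))


print(add_one("2"))  # the text evaluates to an integer, so addition works

try:
    add_one('"2"')  # the text evaluates to a string, so addition fails
except TypeError as error:
    print("TypeError:", error)
```

Nothing in the source of `add_one` distinguishes the two calls; only the runtime data does.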
The unsatisfying solution used in practice is to give `eval()` the type Any, which is like Object in some OO languages or `interface{}` in Go: it's the type that can have any value. Values of type Any aren't constrained in any way, so this effectively removes the type system's ability to help us with code involving eval. Languages with both `eval` and a type system have to abandon type safety whenever `eval` is used.
Some languages have optional or gradual typing: they're dynamic by default, but allow some static annotations to be added. Python has added this feature recently; TypeScript is a superset of JavaScript that adds optional types; Flow does static type analysis on regular old JavaScript code. These languages provide some benefits of static typing, but they can never provide the absolute guarantees of a truly static language. Some functions will be statically typed and some will be dynamically typed. The programmer always needs to be aware of (and wary of) the difference.
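In Python, the optional annotations look like the sketch below. It assumes an external checker such as mypy is run separately over the code; the interpreter itself records the annotations but does not enforce them, which is part of why the guarantees are weaker. The function names are illustrative.

```python
def add_typed(x: int, y: int) -> int:
    # A static checker can verify callers of this function.
    return x + y


def add_untyped(x, y):
    # No annotations: a checker treats the arguments as unconstrained.
    return x + y


# The interpreter stores annotations as metadata only.
print(add_typed.__annotations__)
print(add_untyped.__annotations__)

# At runtime nothing stops a call that a checker would reject.
print(add_typed("a", "b"))
```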
When compiling statically typed code, syntax is checked first, as in any compiler. Types are checked second. This means that a static language will sometimes give us one syntax error, but fixing that error leads to 100 type errors. Fixing the syntax error didn't create those 100 type errors; the compiler was just unable to check the types until the syntax was fixed.
Static language compilers can often generate faster code than dynamic languages. For example, if the compiler knows that the add function takes integers, it can use the CPU's native ADD instruction. A dynamic language would have to check the types at runtime, choosing one of many specific "add" functions depending on the types (is it adding integers, or floats, or is it concatenating strings, or maybe lists?); or, it might have to decide to throw an error if the types don't match. All of that checking takes time.
Dynamic languages have tricks to speed this up, like just-in-time (JIT) compilers, where the code is recompiled at runtime after gathering information about the actual types used. However, no dynamic language can match the speed of carefully written static code in a language like Rust.
Static type system advocates point out that without a type system, simple typing mistakes may result in failures in production. This is definitely true; anyone who has used a dynamic language has experienced it.
Dynamic language advocates point out that dynamic languages seem to be easier to program in. This is definitely true for some kinds of code that we write occasionally, like code that uses `eval`. It's debatable for everyday code, and depends on the ill-defined term "easy". Rich Hickey did a fantastic [talk](https://www.infoq.com/presentations/Simple-Made-Easy) about the word "easy" and its relationship to the word "simple". Watching that will show that it's not easy to use the word "easy" well. Be wary of "easy".
The trade-offs between static and dynamic type systems are still poorly understood, but they definitely depend heavily on the language in question and the problem being solved.
JavaScript tries to continue executing even if that means doing data conversions that are nonsensical (like `"a" + 1` returning `"a1"`). Python, on the other hand, tends to be conservative and throw errors often, as it will in the case of `"a" + 1`. Those are very different approaches with different levels of safety, but Python and JavaScript are both dynamically typed.
C will happily allow the programmer to read from arbitrary locations in memory, or to treat a value of one type as if it had another type, even if that makes no sense at all and leads to a crash. Haskell, on the other hand, won't even allow an integer and a float to be added together without an explicit conversion step added. C and Haskell are both statically typed, despite being wildly different.
There are wide variations within dynamic languages and within static languages. Any blanket statement of the form "static languages are better at x than dynamic languages" is almost certainly nonsense. It may be true of some particular languages, but in that case it's better to say "Haskell is better at x than Python".
Consider two well-known statically typed languages: Go and Haskell. Go's type system lacks generics: types that are "parameterized" by other types. For example, we might create our own list type, MyList, that can contain any type of data that we need to store. We want to be able to create a MyList of integers, a MyList of strings, etc., without making any changes to the source code of MyList. The compiler should enforce those types: if we have a MyList of integers, but accidentally try to insert a string into it, then the compiler should reject the program.
Go was intentionally designed without the ability to define types like MyList. The best that we can do is to create a MyList of "empty interfaces": the MyList can hold objects, but the compiler simply doesn't know what their types are. When we retrieve objects from a MyList, we have to tell the compiler what its type is. If we say "I'm retrieving a string" but the actual value is an integer, we get a runtime error just like we would in a dynamic language.
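The "list of empty interfaces" idea can be sketched in Go roughly as follows. This is only an illustration under assumptions: the MyList struct and its Add and GetString methods are hypothetical names invented here, and the sketch deliberately uses only the empty interface, as the text describes, rather than the type parameters that Go gained later in version 1.18.

```go
package main

import "fmt"

// MyList is a hypothetical list of "empty interfaces". It can hold any
// value, but the compiler does not know the type of each element.
type MyList struct {
	items []interface{}
}

// Add appends a value of any type; nothing is checked at compile time.
func (l *MyList) Add(v interface{}) {
	l.items = append(l.items, v)
}

// GetString claims that element i is a string. The boolean result reports
// whether that claim turned out to be true at run time.
func (l *MyList) GetString(i int) (string, bool) {
	s, ok := l.items[i].(string)
	return s, ok
}

func main() {
	list := &MyList{}
	list.Add(42)   // we meant this to be a list of integers...
	list.Add("hi") // ...but inserting a string compiles without complaint

	if _, ok := list.GetString(0); !ok {
		fmt.Println("element 0 is not a string; detected only at run time")
	}
	// The single-value form, list.items[0].(string), would panic here
	// instead of returning ok == false.
}
```

The two-value type assertion is used so that the mismatch can be reported rather than crashing the program; either way, the error surfaces while the program runs, not when it is compiled.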
Go also lacks many other features present in modern static type systems (or even systems from the 1970s). Its designers have reasons for those decisions, though outsiders' opinions on them can be strong.
Now, let's compare this with Haskell, which has a very powerful type system. If we define a MyList type, then the type of "list of ints" is simply MyList Integer. Haskell will now stop us from accidentally putting strings into our list, and it will ensure that we don't put an element of the list into a variable of type string.
Haskell can also express far more complex ideas directly in types. For example, Num a => MyList a means "a MyList of values that are all the same type of number". It might be a list of integers, or floats, or fixed-precision decimal numbers, but it will definitely never be a list of strings, as verified at compile time.
We might write an add function that works on any numeric type. That function will have the type Num a => (a -> a -> a). This means:
- a can be any type that's numeric (written Num a =>).
- The function takes two arguments of type a and returns type a (written a -> a -> a).
One final example. If a function's type is String -> String, then it takes a string and returns a string; but if it's String -> IO String, then it also does some IO. That IO can be disk access, network access, reading from the terminal, etc.
If a function does not have IO in its type, then we know that it doesn't do any IO. In a web application, for example, we can tell at a glance whether a function might change the database just by looking at its type.
No dynamic languages and few static languages can do this; it's a feature particular to the most powerful static languages. In most languages, we'd have to dig down into the function, and all of the functions that it calls, and so on, looking for anything that might change the database. The process would be tedious and error-prone, whereas Haskell's type system can answer this question easily and with certainty.
Compare all of this power with Go, which can't express the simple idea of MyList, let alone "a function that takes two arguments, both of which are numeric and of the same type, and which then does some IO".
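To make the contrast concrete, here is a rough sketch of how "add two numbers of the same numeric type" might be approximated in Go using only empty interfaces. It is an assumption-laden illustration, not code from the original text: the function handles just int and float64, the error message is invented, and the type parameters that Go gained later in version 1.18 are deliberately left out. The "same type" rule is checked while the program runs, not by the compiler.

```go
package main

import "fmt"

// add accepts two values of any type. Whether they are numeric, and whether
// they share a type, is discovered only at run time.
func add(x, y interface{}) (interface{}, error) {
	switch a := x.(type) {
	case int:
		if b, ok := y.(int); ok {
			return a + b, nil
		}
	case float64:
		if b, ok := y.(float64); ok {
			return a + b, nil
		}
	}
	return nil, fmt.Errorf("add: unsupported or mismatched types %T and %T", x, y)
}

func main() {
	sum, err := add(1, 2)
	fmt.Println(sum, err)

	// Mixing an int with a float64 compiles, then fails at run time.
	_, err = add(1, 2.5)
	fmt.Println(err)
}
```

Every additional numeric type needs another case, and the caller gets back an untyped value that must be asserted again before use.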
Go's approach does make it easier to write tools for programming with Go (most notably, the compiler can be simple), and it also results in fewer concepts to learn. The weighing of those benefits against Go's significant limitations is subjective. However, it's unquestionably true that Haskell is more difficult to learn than Go, that Haskell's type system is more powerful, and that Haskell can prevent far more types of bugs at compile time.
Go and Haskell are so different that lumping them together as "static languages" can be deceiving even though it's a correct use of the term. When comparing on practical safety benefits, Go is closer to dynamic languages than it is to Haskell. On the other hand, some dynamic languages are safer than some static languages (Python is generally considered to be much safer than C). When tempted to make generalizations about static languages or dynamic languages as a group, keep the huge variation between languages in mind.
With more powerful type systems, we can specify constraints at finer levels of granularity. Here are some examples, but don't get bogged down if the syntax doesn't make sense.
In Go, we can say "the add function takes two integers and returns an integer":
func add(x int, y int) int {
    return x + y
}
In Haskell, we can say "the add function takes any type of numbers, and returns a number of that same type":
add :: Num a => a -> a -> a
add x y = x + y
In Idris, we can say "the add function takes two natural numbers and returns a natural number, but its first argument must be smaller than its second argument":
add : (x : Nat) -> (y : Nat) -> {auto smaller : LT x y} -> Nat
add x y = x + y
If we try to call this function as add 2 1, where the first argument is larger than the second, then the compiler will reject the program at compile time. It's impossible to write a program where the first argument to add is larger than the second. Very few languages have this capability. In most languages, this kind of check is done at runtime: we write something like if x >= y: raise SomeError().
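For comparison, the runtime version of that check might look like this in Go. This is a sketch under assumptions: ErrNotSmaller is a hypothetical error value named here for illustration, and uint stands in loosely for Idris's Nat.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotSmaller is a hypothetical error used only for this illustration.
var ErrNotSmaller = errors.New("first argument must be smaller than second")

// add enforces "x is smaller than y" while the program runs. The compiler
// accepts any pair of uint arguments, including invalid ones.
func add(x, y uint) (uint, error) {
	if x >= y {
		return 0, ErrNotSmaller
	}
	return x + y, nil
}

func main() {
	sum, err := add(1, 2)
	fmt.Println(sum, err)

	// This call compiles; the violation shows up only when it executes.
	_, err = add(2, 1)
	fmt.Println(err)
}
```

The constraint is the same one the Idris type states, but here nothing stops a caller from ignoring the returned error.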
Haskell has no equivalent of the Idris type above, and Go has no equivalent of either the Idris type or the Haskell type. As a result, Idris can prevent many bugs that Haskell can't, and Haskell can prevent many bugs that Go can't. In both cases, we need additional type system features, which make the language more complex.
Here's a very rough list of some languages' type system power in increasing order. This is meant to provide a rough sense of type system power and shouldn't be taken as an absolute. The languages grouped together here can be very different from each other. Each type system has its own quirks and most of them are very complex.
- C (1972), Go (2009): These aren't very powerful at all, with no support for generic types. We can't define a MyList type that can be used as a "list of ints", "list of strings", etc. Instead, it will be a "list of unspecified values". The programmer must manually say "this is a string" each time one is removed from the list, which may fail at runtime.
- Java (1995), C# (2000): These both have generic types, so we can say MyList<String> to get a list of strings, with the compiler being aware of and enforcing that constraint. Items retrieved from the list will have type String, enforced at compile time as usual, so runtime errors are less likely.
- Haskell (1990), Rust (2010), Swift (2014): These all share several advanced features, including generics, algebraic data types (ADTs), and type classes or something similar (respectively: type classes, traits, and protocols). Rust and Swift have more mass appeal than Haskell, with strong organizations pushing them (Mozilla and Apple, respectively).
- Agda (2007), Idris (2011): These support dependent types, allowing types like "a function that takes an integer x, and an integer y, where y is greater than x". Even the "y is greater than x" constraint is enforced at compile time: at runtime, y will never be less than or equal to x, no matter what happens. Very subtle but important properties of systems can be verified statically in these languages. Only a small minority of programmers learn them, but some become very excited about them.
There's a clear trend toward more powerful systems over time, especially when judging by language popularity rather than languages simply existing. The notable exception is Go, which explains why many static language advocates shun it as a step backwards. Group two (Java and C#) are mainstream languages that are mature and widely used. Group three looks poised to enter the mainstream, with big backers like Mozilla (Rust) and Apple (Swift). Group four (Idris and Agda) are nowhere near mainstream use, but that may change eventually; group three wasn't anywhere near mainstream use just ten years ago.