May 27, 2014 18:14
diff --git a/type.txt b/type.txt
 The core of what the system is about is:

 - Having rough/flexible data structures
 - Performing transformations on those data structures
 - Finding the core "must be correct" structures we want to reason about
 - Defining and naming those structures with relative precision,
  hardening them into definite declarations
 - Validating and verifying that input and output conform to those
  now-hardened structures.

 As the system started developing, it was all about pervasively flexible
 maps, with varying degrees of strictness about what each map was
 required to contain.

 Now that we've got a somewhat-maturing system, one that we want to be able
 to restructure on a rather large scale and reason about with
 relatively reliable rules, we're coming up on
 the problem of defining and validating those core data structures
 and their transformations.

 We have a few tools to fulfill these roles:

 - maps are the basic, embryonic structure: fast and flexible
  to work with

 - records start to reify dynamically typed tuples,
  with definite field sets that can support abstract protocols

 - prismatic/schema describes compound data structures at runtime
  and allows you to define functions and validators in terms
  of those structures.  It is not "pervasive," in the sense that
  you don't need to annotate an entire namespace, the validation
  is not always active, and the validation is always in terms of
  "does this one given value conform to the schema?"

  Another way to think about this is as a specific type of
  runtime contracts library, where the contracts are structural.

 - core.typed describes values and functions at compile time.  It
  is pervasive in two senses: a) you need to annotate an entire
  namespace at a time, and its dependencies, and b) like traditional
  static type checking, core.typed attempts to reason
  _for all possible values of the given types_.
  This provides stronger guarantees earlier, but in practice is
  harder to build out, especially if your namespace is changing, or it
  deals with flexible manipulations of complex maps.
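
 As a rough illustration of the first three tools side by side, here is a
 minimal sketch.  The `example.tools` namespace, the `Report` record and
 its `:title`/`:rows` fields are made up for this example and are not taken
 from the codebase; the schema calls assume prismatic/schema is on the
 classpath as `schema.core`.

```clojure
(ns example.tools
  (:require [schema.core :as s]))

;; 1. A plain map: nothing to declare, any keys allowed.
(def raw-report {:title "Q2 totals" :rows 12})

;; 2. A record: a definite field set that can also implement protocols.
;;    (Report and its fields are hypothetical.)
(defrecord Report [title rows])
(def report (map->Report raw-report))

;; 3. A schema: a runtime description of the expected structure.
(def ReportSchema
  {:title s/Str
   :rows  s/Int})

;; s/check answers "does this one value conform?": nil means yes,
;; anything else describes the mismatch.
(s/check ReportSchema raw-report)               ; => nil
(s/check ReportSchema {:title "x" :rows "12"})  ; non-nil error description

;; s/defn attaches schemas to a function, but they are only enforced
;; when validation is switched on, e.g. inside s/with-fn-validation.
(s/defn row-count :- s/Int
  [r :- ReportSchema]
  (:rows r))

(row-count {:title "x" :rows "12"})  ; not validated here, returns "12"
;; (s/with-fn-validation (row-count {:title "x" :rows "12"}))  ; would throw
```

 The last two forms are the "not always active" point in practice: the
 same call passes silently by default and fails only once validation is
 turned on.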

 In our system, we're applying these tools to:

 - the backend record structures, which have been around for a little
  while
 - the web API, for which we want both:
  - validation that incoming EDN to the API is well structured
  - testing that outgoing EDN is well structured
 - and in general, that functional transformations we expect are indeed
  still happening how we want them, even in the face of code change
  (e.g. compile-time checking and test suite)
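
 For the incoming side of the web API, the check could look something like
 the sketch below.  `IncomingReport`, `parse-incoming` and the field names
 are assumptions for illustration only; the real API shapes are not shown
 in this commit message.

```clojure
(ns example.api
  (:require [clojure.edn :as edn]
            [schema.core :as s]))

;; Hypothetical shape of an incoming request body.
(def IncomingReport
  {:title s/Str
   :rows  [s/Int]
   (s/optional-key :note) s/Str})

(defn parse-incoming
  "Reads an EDN string and returns {:ok value} when the value conforms
  to IncomingReport, or {:error description} when it does not."
  [body]
  (let [value (edn/read-string body)]
    (if-let [problem (s/check IncomingReport value)]
      {:error problem}
      {:ok value})))

(parse-incoming "{:title \"Q2\" :rows [1 2 3]}")   ; => {:ok {:title "Q2", :rows [1 2 3]}}
(parse-incoming "{:title \"Q2\" :rows \"oops\"}")  ; => {:error ...}
```

 The outgoing side can reuse the same idea inside the test suite: run the
 handler, then `s/check` its result against the schema for the response.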

 We've previously been accumulating a disorganized mishmash of concepts
 and usage.  This commit starts to rectify the situation:

 1. It roughly reconciles the prismatic/schema entries with the core
  defrecord entries in the `report` namespace.  schema provides
  a macro to literally define both in the same declaration.

 2. It attempts to do the data transformation to/from the web API in
    terms of these records and schemas.

 3. It starts adding some basic tests asserting that the transformation
   functions work in terms of these structures.
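
 A minimal sketch of what these three steps can look like together,
 assuming schema's `s/defrecord` macro.  The `Report` fields and the
 `api->report` / `report->api` helpers are hypothetical names, not the
 ones in the `report` namespace.

```clojure
(ns example.report
  (:require [clojure.test :refer [deftest is]]
            [schema.core :as s]))

;; One declaration yields both the record type and its schema.
;; (Field names are hypothetical.)
(s/defrecord Report
  [title :- s/Str
   rows  :- [s/Int]])

(defn api->report
  "Turns a map received from the web API into a Report record."
  [m]
  (map->Report (select-keys m [:title :rows])))

(defn report->api
  "Turns a Report record back into a plain map for the web API."
  [r]
  (into {} r))

;; A basic test asserting that the transformation works in terms of
;; the record and its schema.
(deftest round-trip
  (let [incoming {:title "Q2" :rows [1 2 3]}
        report   (api->report incoming)]
    (is (nil? (s/check Report report)))
    (is (= incoming (report->api report)))))
```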

 This should give us a much stronger foundation to start standing on in
 terms of solid data structures and data validation.

 Furthermore, a few hard-won understandings came out of the process:

 1. Nothing is more concise or more flexible than Clojure's built-in
    map support.  Both schema and core.typed add non-trivial structural
    scaffolding that, while useful for checking that things are still
    the way you want them, is time-consuming and annoying to lay down for
    everything, *especially* if you are trying to rapidly experiment,
    try things out, and move things around.

   Everything should begin as maps, and persist that way for quite
   a while, until you're *sure* that you want to solidify a structure.

 2. schema is nontrivially easier and faster to use than core.typed,
   and more flexible for common cases (e.g. structuring plain maps).
    It is also, I think, more clearly written and documented.  It also
    allows for anonymous schemas, which can come in handy for, say,
    unit tests, without having to fully reify a named schema.

 3. However, core.typed delivers a more complete reasoning structure
    about the code AND importantly runs at compile time AND reinforces
    the important point that *the correctness of the code should be able
    to be reasoned about at compile time*.  It is pretty good for
    most data types; the hardest case is complex manipulation
    of heterogeneous maps, which unfortunately makes up a lot of
    common Clojure code before the maps get reified into records.

 4. In terms of laying down initial security layers, I would recommend
    small sets of plain unit tests and then simple prismatic/schema
    validators.  Both of these only check *small sets of cases*, but they
    do it easily.  schema gives you a more clearly structured way to
    define and validate a structure (reusable for a number of cases),
    which is very good, BUT it does require that you reify a structure,
    which quickly leads to a proliferating number of closely-related
    schemas for non-essential data structures.  This is bad.  You want
    to keep the number of named concepts low and powerful.

 6. The new generative testing tools (test.generative and test.check)
   may be useful in conjunction with schema--since they will generate
   a lot of domain check data, and then you can run all of those
   cases through the schema.  This is still "check by case" technically,
   but it's blanketing a lot more cases.

 7. Finally, core.typed is very powerful but very slow and hard to
   change, and the reasoning and debugging must be very careful.
   I would only recommend this for the most fixed and immutable
   parts of the codebase.

    In addition, core.typed may be able to play a useful role
    in eliminating the need for some mocking and stubbing, if you
    just want to be able to assert through inference that the proper
    types of output come from the proper types of input, without manually
    trying to shove mock objects in as inputs.
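
 The combination of generative testing and schema mentioned above might
 look roughly like this, assuming test.check and prismatic/schema are both
 available.  `summarize` and `gen-report` are hypothetical stand-ins for a
 real transformation and its input domain, and the output schema is
 written inline as an anonymous schema rather than given a name.

```clojure
(ns example.generative
  (:require [clojure.test.check :as tc]
            [clojure.test.check.generators :as gen]
            [clojure.test.check.properties :as prop]
            [schema.core :as s]))

;; Hypothetical transformation under test.
(defn summarize [{:keys [title rows]}]
  {:title title
   :total (reduce + rows)
   :count (count rows)})

;; Generator for input maps in the (assumed) domain.
(def gen-report
  (gen/hash-map :title gen/string-ascii
                :rows  (gen/vector gen/nat)))

;; Property: every generated input produces output that conforms to
;; the inline (anonymous) schema.
(def summarize-conforms
  (prop/for-all [report gen-report]
    (nil? (s/check {:title s/Str :total s/Int :count s/Int}
                   (summarize report)))))

;; Runs 100 generated cases through the schema check.
(tc/quick-check 100 summarize-conforms)
```

 This remains "check by case": a passing run says the 100 generated inputs
 conformed, not that every possible input will.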

 To recap, start and persist with maps and mild unit tests as far as they
 can get you.  Then consider moving up to schema and possibly generative
 testing if necessary.  Only finally move up to core.typed when you're
 really sure it'll be worth the time and security EXCEPT PERHAPS if your
 namespace contains low-hanging fruit--e.g. code that is both important
 and simple enough to cover with core.typed in an efficient manner.
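
 As a sketch of what "low-hanging fruit" for core.typed could mean, here is
 a small function over simple types.  `full-name` and the `example.typed`
 namespace are invented for illustration, and the annotation syntax shown
 is the one I understand core.typed to accept; check it against the
 version in use.

```clojure
(ns example.typed
  (:require [clojure.core.typed :as t]))

;; The annotation states the contract for all possible String inputs,
;; not just for the cases a unit test happens to try.
(t/ann full-name [String String -> String])
(defn full-name [given family]
  (str given " " family))

;; Type checking is a separate step, typically run from the REPL or a
;; test task rather than on every load:
;; (t/check-ns 'example.typed)
```

 Functions like this, with no heterogeneous map manipulation, are the
 cheap cases; the cost rises quickly once flexible maps are involved.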