Visiting the TensorFlow team at Google brought up the question of how array programming with Remora differs from array programming with Numpy and Pandas. Pandas-style data frames, a popular tool in data science, can be expressed in a rank-polymorphic language as arrays of conventional record data. Since records were not part of the original design for Remora, this paper explores the design space to see how best to enable programming with data frames in a future extension of Remora.
Working on type inference for Remora hit a particular snag: when the compiler sees use of a shape-polymorphic function, it needs to be able to identify the implicit shape arguments to instantiate the polymorphic function correctly. While the full theory of shapes (that is, sequences of numbers) is undecidable, this paper explains how the structure of Remora makes it possible for constraint generation to stay within a fragment of the theory where the universal quantifiers can be eliminated.
This is the beginning of the Remora project, presenting the semantics of rank polymorphism embedded in a λ-calculus setting. The mathematical formalism was developed alongside an executable model in PLT Redex, which is linked above. Later work includes cleaned-up notation and the introduction of a kind system to clarify certain restrictions on type quantification.