[Translation] Why do static languages suffer from complexity?

Translated from: Why Static Languages Suffer From Complexity?

Foreword

People in the programming language design community strive to make their languages more expressive, with stronger type systems, primarily to increase code-development efficiency and avoid duplication in the final software. However, the more expressive a language becomes, the more abruptly duplication creeps into the language itself.
This is what I mean by static-dynamic biformity: whenever you introduce a new abstraction into your language, it may reside at the static level, the dynamic level, or both. In the first two cases, where the abstraction lives at only one level, you introduce a language inconsistency; in the latter case, you inevitably introduce feature biformity.
As we know, the static level consists of statements executed at compile time, and the dynamic level of statements executed at runtime. Thus typical control-flow operators (e.g. if/while/for/return), data structures, and procedures are dynamic, while static type system features and syntactic macros are static. Essentially, most static language abstractions have their counterparts in the dynamic space and vice versa:
In the following sections, before elaborating further, let me show you how to implement logically equivalent programs using both static and dynamic approaches. Most of the examples are written in Rust, but they apply to any other general-purpose programming language with a sufficiently expressive type system; keep in mind that this article is language-agnostic and focuses on general PLT philosophy rather than on any specific implementation. If it feels like too much, feel free to skip ahead to the section that interests you.

Record type - Array

Consider an everyday use of record types:

struct Automobile {
    wheels: u8,
    seats: u8,
    manufacturer: String,
}

fn main() {
    let my_car = Automobile {
        wheels: 4,
        seats: 4,
        manufacturer: String::from("X"),
    };

    println!(
        "My car has {} wheels and {} seats, and it was made by {}.",
        my_car.wheels, my_car.seats, my_car.manufacturer
    );
}

(The size of Automobile here can be determined at compile time, so it is a static record type - Translator's Note)
The same can be implemented using arrays:

use std::any::Any;

#[repr(usize)]
enum MyCar {
    Wheels,
    Seats,
    Manufacturer,
}

fn main() {
    let my_car: [Box<dyn Any>; 3] = [Box::new(4), Box::new(4), Box::new("X")];

    println!(
        "My car has {} wheels and {} seats, and it was made by {}.",
        my_car[MyCar::Wheels as usize]
            .downcast_ref::<i32>()
            .unwrap(),
        my_car[MyCar::Seats as usize].downcast_ref::<i32>().unwrap(),
        my_car[MyCar::Manufacturer as usize]
            .downcast_ref::<&'static str>()
            .unwrap()
    );
}

If we specify an incorrect type in .downcast_ref, we get a panic. The logic of the program remains the same; we have merely lifted type checking up to runtime.

Going a step further, we can encode the static Automobile type as a heterogeneous list (heterogeneous list):

use frunk::{hlist, HList};

struct Wheels(u8);
struct Seats(u8);
struct Manufacturer(String);
type Automobile = HList![Wheels, Seats, Manufacturer];

fn main() {
    let my_car: Automobile = hlist![Wheels(4), Seats(4), Manufacturer(String::from("X"))];

    println!(
        "My car has {} wheels and {} seats, and it was made by {}.",
        my_car.get::<Wheels, _>().0,
        my_car.get::<Seats, _>().0,
        my_car.get::<Manufacturer, _>().0
    );
}

This version enforces the exact same type checking as automobile-static.rs (the previous code), but Automobile can also be operated on like a normal collection! For example, we might want to reverse our car:

assert_eq!(
    my_car.into_reverse(),
    hlist![Manufacturer(String::from("X")), Seats(4), Wheels(4)]
);

Or we might want to zip our car with someone else's car:

let their_car = hlist![Wheels(6), Seats(4), Manufacturer(String::from("Y"))];
assert_eq!(
    my_car.zip(their_car),
    hlist![
        (Wheels(4), Wheels(6)),
        (Seats(4), Seats(4)),
        (Manufacturer(String::from("X")), Manufacturer(String::from("Y")))
    ]
);

... etc.
However, sometimes we may wish to apply type-level computation (the type system's type equivalence, compatibility, inference, and native type computation - Translator's Note) to ordinary structs and enums, but we cannot, because we cannot extract the structure of a type definition (its fields, variants, and their signatures) from the type name alone, and we cannot provide a derive macro for a type defined in an external crate. To solve this problem, the Frunk developers created a procedural macro that inspects the internal structure of a type definition by implementing Generic; it has an associated type Repr which, when implemented, is equal to some form of operable HList. Still, all other types that lack this derive macro (except transparent types such as DTOs) remain unscannable due to Rust's aforementioned limitations.

Sum type - Tree

One might find that the sum type is well suited for representing AST nodes:

use std::ops::Deref;

enum Expr {
    Const(i32),
    Add(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
}

use Expr::*;

fn eval(expr: &Box<Expr>) -> i32 {
    match expr.deref() {
        Const(x) => *x,
        Add(lhs, rhs) => eval(&lhs) + eval(&rhs),
        Sub(lhs, rhs) => eval(&lhs) - eval(&rhs),
        Mul(lhs, rhs) => eval(&lhs) * eval(&rhs),
        Div(lhs, rhs) => eval(&lhs) / eval(&rhs),
    }
}

fn main() {
    let expr: Expr = Add(
        Const(53).into(),
        Sub(
            Div(Const(155).into(), Const(5).into()).into(),
            Const(113).into(),
        )
        .into(),
    );

    println!("{}", eval(&expr.into()));
}

The same can be done using tagged trees:

use std::any::Any;

struct Tree {
    tag: i32,
    value: Box<dyn Any>,
    nodes: Vec<Box<Tree>>,
}

const AST_TAG_CONST: i32 = 0;
const AST_TAG_ADD: i32 = 1;
const AST_TAG_SUB: i32 = 2;
const AST_TAG_MUL: i32 = 3;
const AST_TAG_DIV: i32 = 4;

fn eval(expr: &Tree) -> i32 {
    let lhs = expr.nodes.get(0);
    let rhs = expr.nodes.get(1);

    match expr.tag {
        AST_TAG_CONST => *expr.value.downcast_ref::<i32>().unwrap(),
        AST_TAG_ADD => eval(&lhs.unwrap()) + eval(&rhs.unwrap()),
        AST_TAG_SUB => eval(&lhs.unwrap()) - eval(&rhs.unwrap()),
        AST_TAG_MUL => eval(&lhs.unwrap()) * eval(&rhs.unwrap()),
        AST_TAG_DIV => eval(&lhs.unwrap()) / eval(&rhs.unwrap()),
        _ => panic!("Out of range"),
    }
}

fn main() {
    let expr = /* Construction omitted... */;

    println!("{}", eval(&expr));
}

Similar to our operation on struct Automobile, we can represent this using frunk::Coproduct.

Value - Associated type

We may wish to negate boolean values using the standard operator !:

fn main() {
    assert_eq!(!true, false);
    assert_eq!(!false, true);
}

The same can be done with associated types:

use std::marker::PhantomData;

trait Bool {
    type Value;
}

struct True;
struct False;

impl Bool for True {
    type Value = True;
}
impl Bool for False {
    type Value = False;
}

struct Negate<Cond>(PhantomData<Cond>);

impl Bool for Negate<True> {
    type Value = False;
}

impl Bool for Negate<False> {
    type Value = True;
}

const ThisIsFalse: <Negate<True> as Bool>::Value = False;
const ThisIsTrue: <Negate<False> as Bool>::Value = True;

In fact, the Turing-completeness of Rust's type system rests on this principle combined with type-level induction (as we'll see shortly). Every time you see an ordinary Rust value, know that it has a formal counterpart at the type level, in the computational sense; every time you write some algorithm, it has a counterpart in the type system built from conceptually equivalent constructs. If you're interested in how, the linked article provides the proof: first, the author implements Smallfuck using dynamic features (sum types, pattern matching, recursion), and then using static features (logic on traits, associated types, etc.).

Recursion - Type-level induction

Let me show you one more example; this time, pay close attention!

use std::ops::Deref;

#[derive(Clone, Debug, PartialEq)]
enum Nat {
    Z,
    S(Box<Nat>),
}

fn add(lhs: &Box<Nat>, rhs: &Box<Nat>) -> Nat {
    match lhs.deref() {
        Nat::Z => rhs.deref().clone(), // I
        Nat::S(next) => Nat::S(Box::new(add(next, rhs))), // II
    }
}

fn main() {
    let one = Nat::S(Nat::Z.into());
    let two = Nat::S(one.clone().into());
    let three = Nat::S(two.clone().into());

    assert_eq!(add(&one.into(), &two.into()), three);
}

This is the Peano encoding of the natural numbers. In addthe function, we use recursion to calculate the sum, and pattern matching to find out where to stop.
Since recursion corresponds to type induction, and pattern matching corresponds to multiple implementations, the same can be done at compile time (playground):

use std::marker::PhantomData;

struct Z;
struct S<Next>(PhantomData<Next>);

trait Add<Rhs> {
    type Result;
}

// I
impl<Rhs> Add<Rhs> for Z {
    type Result = Rhs;
}


// II
impl<Lhs: Add<Rhs>, Rhs> Add<Rhs> for S<Lhs> {
    type Result = S<<Lhs as Add<Rhs>>::Result>;
}

type One = S<Z>;
type Two = S<One>;
type Three = S<Two>;

const THREE: <One as Add<Two>>::Result = S(PhantomData);

Derivation process (Translator's Note): with One = S<Z> and Two = S<S<Z>>,

<One as Add<Two>>::Result
    = <S<Z> as Add<S<S<Z>>>>::Result
    = S<<Z as Add<S<S<Z>>>>::Result>   // inductive step II, Lhs = Z
    = S<S<S<Z>>>                       // base case I
    = Three
Here, impl ... for Z is the base case (terminating case), and impl ... for S<Lhs> is the inductive step (recursive case) - similar to the pattern matching we used. Also, as in the first example, induction works by reducing the first argument towards Z: <Lhs as Add<Rhs>>::Result is like add(next, rhs) - it again invokes pattern matching to push the computation further. Note that the two trait implementations really belong to a single logical function; they only appear separate because we pattern-match on type-level numbers (Z and S<Next>). This is somewhat similar to Haskell, where each pattern-matching case looks like a separate function definition:

import Control.Exception

data Nat = Z | S Nat deriving Eq

add :: Nat -> Nat -> Nat
add Z rhs = rhs -- I
add (S next) rhs = S(add next rhs) -- II

one = S Z
two = S one
three = S two

main :: IO ()
main = assert ((add one two) == three) $ pure ()

Type-level logic reified

The purpose of this article is only to convey the intuition behind statics-dynamics biformity, not to provide a formal proof - for the latter, see an awesome library called type-operators (by the same person who implemented Smallfuck on types). Essentially, it's an algorithmic macro eDSL that boils down to type-level operations with traits: you can define algebraic data types and perform operations on them, similar to how you usually do in Rust, but ultimately the entire code stays at the type level. For more details, see the translation rules and the excellent guide by the same author. Another notable project is Fortraith, a "compile-time compiler that compiles Forth to compile-time trait expressions":

forth!(
    : factorial (n -- n) 1 swap fact0 ;
    : fact0 (n n -- n) dup 1 = if drop else dup rot * swap pred fact0 then ;
    5 factorial .
);

The code above turns a simple factorial implementation into a computation on traits and associated types. Afterwards, you can obtain the result like this:

println!(
    "{}",
    <<<Empty as five>::Result as factorial>::Result as top>::Result::eval()
);

After considering all of the above, it should be clear that no matter where you put the logic, static or dynamic, the logic itself remains the same.

The unfortunate consequences of being static


Are you quite sure that all those bells and whistles, all those wonderful facilities of your so called powerful programming languages, belong to the solution set rather than the problem set?
Edsger Dijkstra

Today's programming languages don't focus on logic. They focus on the mechanics underlying the logic: they treat boolean negation as the simplest operator that must exist from the start, yet negative trait bounds (which can be understood as negated pattern matching on types - Translator's Note) are considered a controversial concept with "lots of questions". Most mainstream PLs support tree data structures in their standard libraries, but sum types have gone unimplemented for decades. I can't imagine a single language without an if operator, but only a few PLs have mature trait bounds, let alone pattern matching. This is inconsistency - it forces software engineers to design low-quality APIs that are either dynamic and expose few compile-time checks, or static and forced to circumvent fundamental limitations of the host language, becoming increasingly obscure and hard to understand. Combining statics and dynamics in a single working solution is also complicated, because you can't invoke dynamic features in a static context. In terms of function colors, the dynamic color is red and the static color is blue.

In addition to this inconsistency, we have biformity. In languages like C++, Haskell, and Rust, this biformity takes its most perverse form: you can think of any such "expressive" programming language as two or more smaller languages put together: C++ the language plus C++ templates/macros, Rust the language plus type-level Rust plus declarative macros, etc. With this approach, every time you write something at the meta level, you cannot reuse it in the host language, and vice versa, violating the DRY principle (as we'll see in a minute). Additionally, biformity increases the learning curve, hampers language evolution, and ultimately results in feature bloat, where only hardcore fans can figure out what's going on in the code. Look at any production code in Haskell, and you'll immediately see numerous GHC {-# LANGUAGE #-} pragmas, each of which signifies a separate language extension:

{-# LANGUAGE BangPatterns               #-}
{-# LANGUAGE CPP                        #-}
{-# LANGUAGE ConstraintKinds            #-}
{-# LANGUAGE DefaultSignatures          #-}
{-# LANGUAGE DeriveAnyClass             #-}
{-# LANGUAGE DeriveGeneric              #-}
{-# LANGUAGE DerivingStrategies         #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE FlexibleInstances          #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE NamedFieldPuns             #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE PolyKinds                  #-}
{-# LANGUAGE RecordWildCards            #-}
{-# LANGUAGE ScopedTypeVariables        #-}
{-# LANGUAGE TypeFamilies               #-}
{-# LANGUAGE UndecidableInstances       #-}
{-# LANGUAGE ViewPatterns               #-}

When the host language doesn't provide enough static functionality to facilitate development, some programmers go especially insane, creating entirely new compile-time metalanguages and eDSLs on top of the existing ones. Thus, inconsistency has the dangerous property of turning into biformity:

[C++] We have template metaprogramming libraries like Boost/Hana and Boost/MPL that replicate the functionality of C++ at the meta level:

BOOST_HANA_CONSTANT_CHECK(
    hana::take_while(hana::tuple_c<int, 0, 1, 2, 3>, hana::less.than(2_c))
    ==
    hana::tuple_c<int, 0, 1>
);
constexpr auto is_integral =
    hana::compose(hana::trait<std::is_integral>, hana::typeid_);

static_assert(
    hana::filter(hana::make_tuple(1, 2.0, 3, 4.0), is_integral)
    == hana::make_tuple(1, 3), "");
static_assert(
    hana::filter(hana::just(3), is_integral)
    == hana::just(3), "");
BOOST_HANA_CONSTANT_CHECK(
    hana::filter(hana::just(3.0), is_integral) == hana::nothing);

typedef vector_c<int, 5, -1, 0, 7, 2, 0, -5, 4> numbers;
typedef iter_fold<
    numbers,
    begin<numbers>::type,
    if_<less<deref<_1>, deref<_2>>, _2, _1>
>::type max_element_iter;

BOOST_MPL_ASSERT_RELATION(
    deref<max_element_iter>::type::value, ==, 7);

[C] My own compile-time metaprogramming framework, Metalang99, does the same thing by (ab)using the C preprocessor. It grew to such an extent that I was forced to reimplement recursion through a combination of Lisp-like trampolining and continuation-passing style (CPS). Finally, I have a large number of list-manipulation functions in the standard library, such as ML99_listMap, ML99_listIntersperse, and ML99_listFoldr, which arguably makes Metalang99, as a pure data-transformation language, more expressive than C itself.
[Rust] In the first Automobile example of inconsistency, we used the Frunk library's hlist. It's not hard to see that Frunk replicates some of the functionality of collections and iterators just to bring them up to the type level. It might be cool to apply Iterator::map or Iterator::intersperse to an hlist, but we can't. Worse, if we still want to perform declarative data transformations, we have to maintain a 1-to-1 correspondence between iterator adapters and type-level adapters whenever a utility is missing from hlist.
[Rust] Typenum is another popular type-level library: it allows integer computations to be performed at compile time by encoding integers as generics. By doing this, the part of the language responsible for integers finds its counterpart in statics, introducing more biformity. We can't just parameterize certain types with (2 + 2) * 5; we have to write something like <<P2 as Add<P2>>::Output as Mul<P5>>::Output! The best you can do is write a macro that does the dirty work for you, but it's only syntactic sugar - and you'll see tons of compile-time errors of the above shape anyway.
Sometimes software engineers find their language too primitive to express their ideas even in dynamic code. But they don't give up:
[Golang] Kubernetes, one of the largest codebases in Go, implements its own object-oriented type system in the runtime package.
[C] The VLC media player has a macro-based plugin API for representing media codecs; see, for example, the definition of the Opus plugin.
[C] The QEMU computer emulator builds on its custom object model: QObject, QNum, QNull, QList, QString, QDict, QBool, etc.
Recall Greenspun's famous tenth rule (yes, the one we all know: "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp." - Annotation). Such hand-crafted metalanguages are often "ad hoc, informally specified, and bug-ridden", with rather vague semantics and terrible documentation. The notion of metalinguistic abstraction simply doesn't work here, although the rationale for creating highly declarative, small domain-specific languages sounds cool at first glance. When a problem entity (or some intermediate mechanism) is expressed in the host language, you only need to understand how to chain calls together to get the job done - this is what we usually call an API; but when this API is written in another language, then, in addition to the calling sequences, you need to know the syntax and semantics of that language. This is unfortunate for two reasons: the mental burden it places on developers, and the very limited number of developers able to maintain such a metalanguage. In my experience, hand-crafted metalanguages tend to quickly get out of hand and spread throughout the codebase, making it harder to reason about. And not only is reasoning impaired, but also compiler-developer interaction: have you ever tried to use a complex type-level or macro API? If yes, then you should be thoroughly familiar with incomprehensible compiler diagnostics, which can be summed up in the following screenshot:
Sad to say, but nowadays it seems that an "expressive" PL means "hey, I seriously screwed up the feature count, but that's okay!"

Finally, a word must be said about metaprogramming in the host language. Using systems like Template Haskell and Rust's procedural macros, we can process the host language's AST in the same language, which is nice in terms of biformity but unpleasant in terms of language inconsistency. Macros are not functions: we can't partially apply a macro and get a partially applied function (or vice versa), because they're just different concepts - which can be a pain in the neck when designing a general and easy-to-use library API. Personally, I do think procedural macros in Rust are a huge design mistake, comparable to #define macros in plain C: the macro system has no knowledge of the language beyond pure syntax; you get slightly enhanced text replacement instead of a tool for extending the language gracefully. For example, suppose there is an enum named Either, defined as follows:

pub enum Either<L, R> {
    Left(L),
    Right(R),
}

Now imagine we have an arbitrary trait Foo, and we want to implement it for Either<L, R> whenever both L and R implement it. It turns out that we can't apply a derive macro to Either to achieve this, even if Foo's name is known, because to do so the macro would have to know all of Foo's method signatures. Worse, Foo may be defined in a separate library, which means we cannot augment its definition with the extra meta-information needed for derivation. While this might seem like a rare case, it's actually not; I highly recommend looking at tokio-util's Either, which is exactly the same enum but implements Tokio-specific traits, e.g. AsyncRead, AsyncWrite, AsyncSeek, etc. Now imagine having five different Eithers from different libraries in your project - that would be a headache! While type introspection (the ability to inspect a type's structure, perhaps more familiar to you as "reflection" - Annotation) might be a compromise, it would still make the language more complex than it already is.

Idris: The way out?

One of the most fundamental features of Idris is that types and expressions are part of the same language - you use the same syntax for both.
Edwin Brady, the author of Idris

Let's think about how to solve this problem. If we made our language fully dynamic, we would have no problems with biformity and inconsistency, but we would quickly lose the ability to verify code at compile time and end up debugging our programs in the middle of the night. The pain of dynamic type systems is well known.

The only way to solve this problem is to make a single language feature work both statically and dynamically, instead of splitting the same functionality in two. Thus, an ideal language abstraction is both static and dynamic - yet it remains a single concept, rather than two logically similar systems with different interfaces. A perfect example is CTFE, commonly known as constexpr: the same code can be executed at compile time in a static context and at runtime in a dynamic context (e.g. when user input is involved). So instead of writing different code for compile time (statics) and runtime (dynamics), we use the same representation.

One possible solution I see is dependent types (types that depend on values, corresponding to the universal and existential quantifiers of predicate logic - Annotation). With dependent types, we can parameterize types not only by other types but also by values. In the dependently typed language Idris, there is a type called Type - the "type of all types" - which dissolves the dichotomy between the type level and the value level. With such power, we can express in types abstractions that are usually either built into the language compiler/environment or done via macros. Perhaps the most common and descriptive example is a type-safe printf, which computes the types of its arguments dynamically, so let's have some fun with that in Idris!

First, define the inductive data type Fmt and a function that obtains it from a format string:

data Fmt = FArg Fmt | FChar Char Fmt | FEnd

toFmt : (fmt : List Char) -> Fmt
toFmt ('*' :: xs) = FArg (toFmt xs)
toFmt (  x :: xs) = FChar x (toFmt xs)
toFmt [] = FEnd

Later, we'll use this Fmt to generate a type for our printf function. The syntax is very similar to Haskell and should be understandable to the reader.

Now for the fun part:

PrintfType : (fmt : Fmt) -> Type
PrintfType (FArg fmt) = {ty : Type} -> Show ty => (obj : ty) -> PrintfType fmt
PrintfType (FChar _ fmt) = PrintfType fmt
PrintfType FEnd = String

What does this function do? It computes a type from the input parameter fmt. As usual, we split fmt into three cases and handle them separately:

  1. (FArg fmt). This case produces a type signature that takes an additional parameter, since FArg indicates that we will be given a printable argument:
    1. {ty : Type} means that Idris will deduce the type ty automatically (an implicit parameter).
    2. Show ty is a type constraint saying that ty must implement Show.
    3. (obj : ty) is the printable argument we have to supply to printf.
    4. PrintfType fmt is a recursive call to process the rest of the input. In Idris, recursive types are managed by recursive functions!
  2. (FChar _ fmt). This represents an ordinary character in the format string, so we ignore it and carry on with PrintfType fmt.
  3. FEnd. This is the end of the input. Since we want printf to produce a String, we return String as an ordinary type.

Now suppose we have the format string "*x*", i.e. FArg (FChar 'x' (FArg FEnd)); what type will PrintfType generate? It's simple:

1. FArg: {ty : Type} -> Show ty => (obj : ty) -> PrintfType (FChar 'x' (FArg FEnd))
2. FChar: {ty : Type} -> Show ty => (obj : ty) -> PrintfType (FArg FEnd)
3. FArg: {ty : Type} -> Show ty => (obj : ty) -> {ty : Type} -> Show ty => (obj : ty) -> PrintfType FEnd
4. FEnd: {ty : Type} -> Show ty => (obj : ty) -> {ty : Type} -> Show ty => (obj : ty) -> String
Cool, now it's time to implement the printf we've always dreamed of:

printf : (fmt : String) -> PrintfType (toFmt $ unpack fmt)
printf fmt = printfAux (toFmt $ unpack fmt) [] where
    printfAux : (fmt : Fmt) -> List Char -> PrintfType fmt
    printfAux (FArg fmt) acc = \obj => printfAux fmt (acc ++ unpack (show obj))
    printfAux (FChar c fmt) acc = printfAux fmt (acc ++ [c])
    printfAux FEnd acc = pack acc

As you can see, PrintfType (toFmt $ unpack fmt) appears in the type signature, which means that the type of printf depends on the input parameter fmt! But what does unpack fmt mean? Since printf takes fmt : String, we must convert it to List Char beforehand, because toFmt pattern-matches on a list of characters; as far as I know, Idris does not allow matching on a String in the same way. Likewise, we unpack fmt before calling printfAux, since it also operates on List Char.

Let's examine the implementation of printfAux:

  1. (FArg fmt). Here we return a lambda that takes obj, calls show on it, and appends the result to acc with the ++ operator.
  2. (FChar c fmt). Just append c to acc and call printfAux again with fmt.
  3. FEnd. Although acc is a List Char, we must return a String (per the last case of PrintfType), so we call pack on it.

Finally, test printf:

printf.idr

main : IO ()
main = putStrLn $ printf "Mr. John has * contacts in *." 42 "New York"

This will print Mr. John has 42 contacts in "New York". But what if we don't provide the 42?

Error: While processing right hand side of main. When unifying:
    ?ty -> PrintfType (toFmt [assert_total (prim__strIndex "Mr. John has * contacts in *." (prim__cast_IntegerInt (natToInteger (length "Mr. John has * contacts in *.")) - 1))])
and:
    String
Mismatch between:
    ?ty -> PrintfType (toFmt [assert_total (prim__strIndex "Mr. John has * contacts in *." (prim__cast_IntegerInt (natToInteger (length "Mr. John has * contacts in *.")) - 1))])
and:
    String.
test:21:19--21:68
 17 | printfAux (FChar c fmt) acc = printfAux fmt (acc ++ [c])
 18 | printfAux FEnd acc = pack acc
 19 |
 20 | main : IO ()
 21 | main = putStrLn $ printf "Mr. John has * contacts in *." "New York"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Warning: compiling hole Main.main

Yes, Idris detected the error and reported a type mismatch! This is basically the way to achieve a type-safe printf with first-class types. If you're curious about the same thing in Rust, take a look at Will Crichton's attempt, which relies heavily on the heterogeneous lists we saw above. The downside of that approach should be pretty clear by now: in Rust, the language of the type system is different from the main language, but in Idris they are really the same thing - which is why we are free to define type-level functions as regular functions returning a Type, and call them later in type signatures. Also, since Idris is dependently typed, you can even compute types from certain runtime arguments, which is not possible in languages like Zig.
I've anticipated the question: what's the problem with implementing printf as a macro? After all, println! works just fine in Rust. The problem is the macro itself. Think about it: why do programming languages need heavyweight macros? Because we might want to extend the language. Why would we extend it? Because the language doesn't fit our needs: we can't express something using regular language abstractions, so we decide to extend the language with ad-hoc meta-abstractions. In the main part, I argued why this approach sucks - the macro system has no clue about how the language works; in fact, procedural macros in Rust are little more than a fancy name for an M4-style preprocessor. You have integrated M4 into your language. Sure, that's better than an external M4, but it's still a 20th-century approach. Also, macros don't even manipulate an abstract syntax tree: syn::Item, a common structure for writing procedural macros, is really a concrete syntax tree, or "parse tree". Types, on the other hand, are a natural part of the host language, which is why expressing programming abstractions with types means reusing language abstractions instead of resorting to ad-hoc mechanisms. Ideally, a programming language should have no macros at all, or only lightweight syntax-rewriting rules (such as Scheme's syntax-rules or Idris' syntax extensions), to keep the language consistent and well-suited to its intended tasks.

Having said that, Idris eliminates the first biformity, values - generics, by introducing the "type of all types", Type. By doing so, it also resolves many of the other correspondences, such as recursion vs. type-level induction and functions vs. the trait mechanism; in turn, this allows as much as possible to be programmed in one and the same language, even for highly generalized code. For example, you can even represent a list of types as List Type, just like List Nat or List String, and handle it as usual! This is possible thanks to the cumulative hierarchy of universes (see below): since the generic parameter a of Data.List is implicitly typed Type, a can be Nat, String, or even Type itself; in the latter case, a will be deduced to live in Type 1. Such an infinite sequence of universes is needed to avoid a variation of Russell's paradox, keeping every inhabitant "structurally smaller" than its type.
However, Idris is not a simple language. Our twenty-line printf example already used a whole lot of features: inductive data types, dependent pattern matching, implicits, type constraints, and so on. Beyond that, Idris has computational effects, elaborator reflection, coinductive data types, and much more machinery for theorem proving. With so many tools, you often end up fiddling with the language instead of doing meaningful work. I find it hard to believe that, in their current state, dependently typed languages will see much production use; for now, in the programming world, they are little more than a fancy toy. Dependent types by themselves are too low-level.
insert image description here

Zig: simpler, but systems-oriented


In Zig, types are first-class citizens. They can be assigned to variables, passed as parameters to functions, and returned from functions.
The Zig manual (Zig developers, n.d.)

Our last patient is the Zig programming language. Here's a comptime implementation of printf in Zig (sorry, no syntax highlighting yet):

const std = @import("std");

fn printf(comptime fmt: []const u8, args: anytype) anyerror!void {
    const stdout = std.io.getStdOut().writer();

    comptime var arg_idx: usize = 0;

    inline for (fmt) |c| {
        if (c == '*') {
            try printArg(stdout, args[arg_idx]);
            arg_idx += 1;
        } else {
            try stdout.print("{c}", .{c});
        }
    }

    comptime {
        if (args.len != arg_idx) {
            @compileError("Unused arguments");
        }
    }
}

fn printArg(stdout: std.fs.File.Writer, arg: anytype) anyerror!void {
    if (@typeInfo(@TypeOf(arg)) == .Pointer) {
        try stdout.writeAll(arg);
    } else {
        try stdout.print("{any}", .{arg});
    }
}

pub fn main() !void {
    try printf("Mr. John has * contacts in *.\n", .{ 42, "New York" });
}

Here we use a feature called comptime: marking a function parameter comptime means its value must be known at compile time. Not only does this allow aggressive optimizations, it also opens up a wealth of metaprogramming facilities, most notably without any separate macro-level or type-level sublanguage. The code above needs no further explanation: the straightforward logic should be clear to any programmer and, unlike printf.idr, it doesn't look like the fruit of a mad genius's fancy.

If we omit 42, Zig will report a compilation error:

An error occurred:
/tmp/playground2454631537/play.zig:10:38: error: field index 1 outside tuple 'struct:33:52' which has 1 fields
            try printArg(stdout, args[arg_idx]);
                                     ^
/tmp/playground2454631537/play.zig:33:15: note: called from here
    try printf("Mr. John has * contacts in *.\n", .{ "New York" });
              ^
/tmp/playground2454631537/play.zig:32:21: note: called from here
pub fn main() !void {

The only inconvenience I ran into while developing printf was the huge error messages, much like with C++ templates. However, I concede that this could be solved (or at least mitigated) with more explicit type constraints. Overall, the design of Zig's type system seems sound: there is a single type of all types, called type, and with comptime we can compute types at compile time using regular variables, loops, procedures, and so on. We can even perform type reflection via the @typeInfo, @typeName, and @TypeOf builtins! True, types can no longer depend on runtime values, but if you don't need a theorem prover, full dependent types may well be overkill.
insert image description here
Everything is fine, except that Zig is a systems language. The official website describes Zig as a "general-purpose programming language", but I find that statement hard to agree with. Yes, you can write almost any software in Zig, but should you? My experience maintaining high-level code in Rust and C99 says "no". The first reason is safety: if you make your systems language safe, you force programmers to deal with a borrow checker and ownership (or some equivalent), concerns completely unrelated to business logic (trust me, I know how painful that is); if instead you choose C-style manual memory management, you leave programmers debugging their code for hours, hoping -fsanitize=address shows something meaningful. Moreover, if you build new abstractions on top of pointers, you end up with &str, AsRef<str>, Borrow<str>, Box<str>, and the like. Please, I just want a UTF-8 string; most of the time, I don't care which of these alternatives I'm holding.

The second reason has to do with the language runtime: to avoid hidden performance penalties, a systems language must have a minimal runtime: no default GC, no default event loop, and so on. But a particular application may well need a runtime, for example an asynchronous one, so in practice you end up dealing with custom runtime code anyway. Here we run into a whole new class of function-colouring problems (see above): having no facilities in your language to abstract over synchronous and asynchronous functions splits the language into two parts, synchronous and asynchronous; a generic high-level library, for instance, will inevitably be marked async in order to accept arbitrary user callbacks. To solve this, you need some form of effect polymorphism (for example, monads or algebraic effects), which is still a research topic. High-level languages inherently have fewer of these problems to deal with, which is why most application software is written in Java, C#, Python, and JavaScript. In Go, conceptually every function is async, so the defaults keep the language consistent without resorting to complex type machinery. Rust, in contrast, is widely recognized as a complex language, and there is still no standard way to write truly general-purpose asynchronous code.

Where Zig can still be used is in large systems projects such as web browsers, interpreters, and operating system kernels: nobody wants these things to freeze unexpectedly. Zig's low-level programming capabilities make it convenient to manipulate memory and hardware devices, while its sound metaprogramming approach (in the right hands) fosters understandable code structure. Pushing high-level code into it, though, would only increase the mental burden without providing measurable benefits.

Progress is possible only if we train ourselves to think about programs without thinking of them as pieces of executable code.
Edsger W. Dijkstra

epilogue

Static languages enforce compile-time checks, which is good. But they suffer from feature biformity and inconsistency, which is bad. Dynamic languages, on the other hand, suffer from these shortcomings to a lesser extent, but they lack compile-time checks. A hypothetical solution should take the best of both worlds.

Programming languages ​​should be rethought.

Supplement

A brief introduction to some of the language features mentioned above.

Borrowing

Appears in Rust.
At any given time, a resource can have either any number of immutable references or exactly one mutable reference.
A borrow's scope must not exceed the owner's scope.
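A minimal Rust sketch of these rules (the demo function and the strings are invented for illustration):

```rust
// A shared (immutable) borrow and an exclusive (mutable) borrow of the
// same String, kept apart so the borrow checker accepts the program.
fn demo() -> String {
    let mut s = String::from("hello");

    // Any number of immutable borrows may coexist...
    let (a, b) = (&s, &s);
    assert_eq!(format!("{a} {b}"), "hello hello");

    // ...but a mutable borrow must be exclusive; it is allowed here only
    // because `a` and `b` are no longer used past this point.
    let m = &mut s;
    m.push_str(", world");

    s
}

fn main() {
    assert_eq!(demo(), "hello, world");
}
```

Swapping the order (taking `&mut s` while `a` and `b` are still live) turns this into a compile-time error rather than a runtime bug.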

Tuple structs

Appear in Rust.
They are structs in the form of tuples, meant for simple data that needs its own type but shouldn't be overly elaborate:

struct Color(u8, u8, u8);
struct Point(f64, f64);

let black = Color(0, 0, 0);
let origin = Point(0.0, 0.0);

Phantom data

Rust-specific.
PhantomData is a zero-sized marker type.

Uses:
marking otherwise unused type parameters;
controlling variance;
marking ownership relationships;
guiding automatic trait implementations (Send/Sync).
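A small sketch of the "unused type parameter" use, with hypothetical unit marker types Meters and Feet invented for the illustration:

```rust
use std::marker::PhantomData;

// Hypothetical unit marker types, invented for this illustration.
struct Meters;
struct Feet;

// `PhantomData<Unit>` makes `Unit` part of the type even though no value
// of `Unit` is ever stored, so the wrapper stays zero-cost.
struct Length<Unit> {
    value: f64,
    _unit: PhantomData<Unit>,
}

impl<Unit> Length<Unit> {
    fn new(value: f64) -> Self {
        Length { value, _unit: PhantomData }
    }
}

fn main() {
    let m: Length<Meters> = Length::new(3.0);
    let f: Length<Feet> = Length::new(3.0);

    // Same layout as a bare f64: PhantomData occupies no space.
    assert_eq!(std::mem::size_of::<Length<Meters>>(), std::mem::size_of::<f64>());

    // `m` and `f` carry the same value but have incompatible types,
    // so mixing units up is a compile-time error.
    assert_eq!(m.value, f.value);
}
```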

Opaque and protocol types

Taking Swift as an example:
Protocol type: a type characterized by the set of methods it supports.
Opaque type: hides the concrete type of a return value; the compiler can see it, but the client cannot.
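Rust draws a roughly analogous distinction with impl Trait (opaque) and trait objects (protocol-like); this is a loose analogy sketched here for illustration, not Swift semantics:

```rust
use std::fmt::Display;

// Opaque return type: the concrete type (`i32`) is known to the compiler
// but hidden from the caller, which only sees `impl Display`.
fn opaque() -> impl Display {
    42
}

// Trait-object return type: the concrete type is erased at runtime,
// so different branches may return different types.
fn erased(flag: bool) -> Box<dyn Display> {
    if flag {
        Box::new(42)
    } else {
        Box::new("forty-two")
    }
}

fn main() {
    assert_eq!(opaque().to_string(), "42");
    assert_eq!(erased(false).to_string(), "forty-two");
}
```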

implicit

Taking Scala as an example,
scala implicitis used to implicitly pass parameters, including the implicit value of the function, implicit view (used for implicit conversion of parameter types), and implicit conversion (calling methods that do not exist in the class)

Traits

Taking Rust as an example: a trait declares a set of behaviour that types can implement, playing the role that interfaces or type classes play elsewhere.
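A minimal sketch (the Area trait and the shape types are invented for illustration):

```rust
use std::f64::consts::PI;

// A trait declares behaviour; each type provides its own implementation.
trait Area {
    fn area(&self) -> f64;
}

struct Circle { radius: f64 }
struct Square { side: f64 }

impl Area for Circle {
    fn area(&self) -> f64 { PI * self.radius * self.radius }
}

impl Area for Square {
    fn area(&self) -> f64 { self.side * self.side }
}

// Code written against the trait works for every implementor.
fn total_area(shapes: &[&dyn Area]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let (c, s) = (Circle { radius: 1.0 }, Square { side: 2.0 });
    assert!((total_area(&[&c, &s]) - (PI + 4.0)).abs() < 1e-9);
}
```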

Case classes

Taking Scala as an example:
Case classes are good at modelling immutable data.
A case class has a default apply method for instantiating it.
Case classes are compared by structure rather than by reference.
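A rough Rust analogue, sketched with derive to get structural equality and cloning for an immutable record (the Point type is invented for illustration):

```rust
// A rough analogue of a Scala case class: `derive` supplies structural
// equality, cloning, and debug printing for a plain record.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = a.clone();

    // Compared by structure, not by identity:
    assert_eq!(a, b);
    assert_ne!(a, Point { x: 0, y: 0 });
}
```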

Monoid

Taking Scala as an example: a monoid is a set equipped with an associative binary operation and an identity element.
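A minimal Rust sketch of the idea (the Monoid trait and the fold_all helper are invented for illustration):

```rust
// A minimal monoid: an identity element plus an associative `combine`.
trait Monoid {
    fn empty() -> Self;
    fn combine(self, other: Self) -> Self;
}

impl Monoid for i32 {
    fn empty() -> Self { 0 }
    fn combine(self, other: Self) -> Self { self + other }
}

impl Monoid for String {
    fn empty() -> Self { String::new() }
    fn combine(self, other: Self) -> Self { self + &other }
}

// Any monoid can be folded the same way, starting from the identity.
fn fold_all<M: Monoid>(items: Vec<M>) -> M {
    items.into_iter().fold(M::empty(), M::combine)
}

fn main() {
    assert_eq!(fold_all(vec![1, 2, 3]), 6);
    assert_eq!(fold_all(vec!["foo".to_string(), "bar".to_string()]), "foobar");
}
```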

Scala context bounds

A feature introduced in Scala 2.8, often used together with the type class pattern.

// The following two definitions are equivalent:
def foo[A](a: A)(implicit b: B[A]) = g(a)
// the context bound folds the implicit B[A] parameter into A's declaration
def foo[A : B](a: A) = g(a)

Since a context bound leaves the implicit parameter unnamed, you use the implicitly identifier to retrieve the implicit value of that type from the context:

def fold[F[_], A](list: F[A])(m: Monoid[A])(implicit f: Foldable[F]): A =
  f.foldLeft(list)(m.zero)(m.combine)

// equivalently, with a context bound:
def fold[F[_]: Foldable, A](list: F[A])(m: Monoid[A]): A =
  implicitly[Foldable[F]].foldLeft(list)(m.zero)(m.combine)

// the Foldable[F] parameter is passed implicitly

Isomorphism

(Translator's note: I don't fully understand this code.)

// A pair of arbitrary case classes
case class Foo(i : Int, s : String)
case class Bar(b : Boolean, s : String, d : Double)

// Publish their `HListIso`'s
implicit def fooIso = Iso.hlist(Foo.apply _, Foo.unapply _)
implicit def barIso = Iso.hlist(Bar.apply _, Bar.unapply _)

// And now they're monoids ...

implicitly[Monoid[Foo]]
val f = Foo(13, "foo") |+| Foo(23, "bar")
assert(f == Foo(36, "foobar"))

implicitly[Monoid[Bar]]
val b = Bar(true, "foo", 1.0) |+| Bar(false, "bar", 3.0)
assert(b == Bar(true, "foobar", 4.0))


Origin blog.csdn.net/treblez/article/details/122760463