Templating engine for app configuration helpers

Aleks · May 20, 2018, 5:03pm

This is a follow-up of a discussion started here and here

Problem

We currently handle configuration template with rules like these, i.e. have a file with __PATH__, __PORT__, etc… at the appropriate places, then the helpers do a search-and-replace of these keywords with their actual values.

This is fine for the majority of apps and historically was simpler to implement, but nevertheless :

currently we are basically implementing our own template engine, in every helper that need that kind of feature (nginx, phpfpm, systemd, and maybe others already or in the future) ;
there are historical non-trivial conversion of e.g. $path_url to __PATH__, not to be mistaken for __FINAL_PATH__ which comes from $final_path… One can only understand those by reading the doc of the helper or reading other people’s template ;
there’s a constrain on which variables can be used in which templates (we only replace a very specific list of keywords depending on the template). Some apps need more variables, and that led to the initial PR which introduces a hack, where instead it would make sense to just be able to use any variable ;
for element of logics (e.g. adapt the template if __PATH__ is just /) we have to put this logic outside the template, and use other tricks to comment/decomment the proper lines instead of having this logic inside the template ;

Proposal

So the proposal is to implement or use a proper template engine, which the other helpers would them rely upon to “render” the template.

Some people already started to use Jinja2, which is kind of “the current standard” (?) for templates as far as I know … Applied to a nginx conf, this gives something like this.

While discussing this, it should be understood that it’s not just a matter of “people can just define their own helper if they want to” : not using the official helpers mean that you won’t benefit from their evolutions in the future (like the automatic transition from php5 to php7 for stretch) and therefore increase the maintenance effort required to keep the app alive.

So here I try to list the pro and cons of using Jinja2 … though of course I am very biased. But at least that should get the discussion (re)started

Pros

Have a single template engine for all helpers
Jinja2 is used in many other projects so people can use their knowledge of it (or learn it) when writing apps
Direct correspondence between the bash variable and the template : $path_url <-> {{ path_url }}
Be able to use any variable (as long as you exported it previously)
Be able to manage the logic of the template inside it (e.g. if blocks)

Cons

Less readable than the __VAR__ syntax, especially when used in e.g. nginx templates which might include complex regex stuff / rewriting rules…
???

Maniack_Crudelis · May 20, 2018, 6:32pm

As I said before, I’m not against jinja2 itself, I’m against a syntax difficult to read, difficult to use, and globally something that makes packaging more complicated.

I’ve started to read quickly the documentation, and I can’t say that I’m relieved about the complexity of jinja2.
http://jinja.pocoo.org/docs/2.10/templates/
For me, jinja2 is far too much tinted by python, there’s a lot of parts of the documentation not understandable if you don’t know python.

But, I agree that the PR #462 shows that our helpers and our current template isn’t perfect.
So, as I already said before, if we can change the default syntax:

{{ a_variable }}

to something easier to see in the middle of the big config file, currently

__A_VARIABLE__

It could be interesting to use jinja2.

As far as I understand, it’s possible to change this syntax by modifying variable_start_string and variable_end_string
http://jinja.pocoo.org/docs/2.10/api/#jinja2.Environment
But still, I’ve no fucking idea how to do that… And I maintain that this documentation is made for python adepts… But we’re using bash for apps.

So, if I agree that

is clearly something we would have for our helpers.

But I totally disagree with

As the developers of jinja2 (here), I think we should keep the logic outside of the template, because, first we can do that easily in our case, second if the logic is inside the template, you’re not going to see what’s going on when you’re reading the app’s scripts.
And it’s really (really REALLY) important to keep our apps easy to read and understand as much as possible.

Unfortunately, I’m quite sure that if we implement jinja2, many packagers will use jinja2 to add logic blocks inside of the templates. And it’s again going to be a fucking hell to maintain those apps after them…
As it was for many official apps before refactoring.

rndmh3ro · May 24, 2018, 7:14am

Using jinja2 is a good idea instead of using your own templating language!

You should also not change the variables from {{variable}} to something else to make it easier.
The reason is that in fact it only makes it easier for people who never wrote a helper at all.

For everyone else, using not the normal jinja-syntax will be confusing! Trying to find help online without the normal syntax would be harder, too.

So please: use jinja in its default!

Josue · May 24, 2018, 12:53pm

Hello,

On my side, I don’t have a specific point of view about using or not jinja.

For me the most important thing is keep the same template everywhere in all app and helper. The problem might be if we migrate to jinja to have some apps who use only the actual template and some others jinja and maybe some app might use the both system.

I also think that if we want to migrate to that we need to think about how we will migrate own actual helper to stay compatible with all apps.

jibecfed · May 26, 2018, 7:50am

Thanks for this great proposal
.

I’m not sure either having a templating system for this use case is a good idea.
Risks of too much complexity is too high compare to the benefits.
.

But I think there is another use case for this template system: inform admin and users about what’s installed on the server.

What I created a few month ago is a helper to send an email to the user with important information about the just installed package. Maniack improved it and it is now in 56 packages:

https://github.com/search?q=org%3AYunoHost-Apps+ynh_send_readme_to_admin&type=Code

What I think we can do it to build a app.info so the information is available over time.
And as jinja2 supports localization, we should plan from the beginning internationalization (and think later about how to make this work with translate.yunohost.org).

This probably will help us to better respect licenses asking you to give access and information to the package source.

Josue · May 31, 2018, 6:39pm

Hello,

To resume actually the advantage of Jinja2 is :

Support of custom variable
More standardized and used in other project
Make the helpers for the service more simple and generic.

But the disadvantage is :

Black box effect : we don’t know in the script what is done in the config file.
More complicated syntax
Migration from own actual template

And there are some advantage for some people and disadvantage for some others :

Support of if / else statement

In all case it might be complicated to take a decision which is agreed by all packager but I think we will need to take it.

So what does we decide ? Does we do a vote ?

Maniack_Crudelis · May 31, 2018, 7:19pm

I think also that we should think about what are the problems with the current template engine. Before thinking about how to fix them.

But, I can see only one problem with this template engine, the variables problem, which was pointed by Josue in a PR (sorry I’m on my phone I can’t find this PR).

The non standart issue isn’t an issue for me since jinja2 would have to be learned by most of our packagers. So there’s no difference.
And I’m not so sure that jinja2 is widely used in bash.

Josue · May 31, 2018, 7:28pm

But, I can see only one problem with this template engine, the variables problem, which was pointed by Josue in a PR (sorry I’m on my phone I can’t find this PR).

Yes, but I also think that a code which look like that is not really elegant :

 ynh_add_nginx_config
if [ "$path_url" != "/" ]
then
	ynh_replace_string "^#sub_path_only" "" "/etc/nginx/conf.d/$domain.d/$app.conf"
fi
ynh_store_file_checksum "/etc/nginx/conf.d/$domain.d/$app.conf"

And this is going to be in all app !!

I just think that it could be good to take one time a decision before to put something like that in all app and after we need to change a new time in all apps. But it’s stay my point of view.

Aleks · May 31, 2018, 8:38pm

That’s what was explained in the original topic (though it’s from my point of view)

code duplication related to templating
historical stuff (e.g. $path_url -> __PATH__)
limited list of variables that can be used
template-related logic being outside of the template

About the question of putting the logic inside of outside the template : to me, it all ends up to which logic we are talking about. There are two kinds of logic :

logic about pre-processing your data (e.g. computations, filters, sorting, …)
logic about rendering your template

To illustrate the first case, think about some data stored in a database and you want to render this in a webpage. Well first you query the database, keep only the entries with foo>=5 and sort them according to bar in ascending order. This, you should not do in the template, because this treatment in independent of the way you’re going to render it (and other pieces of software might need the same data so you want to keep it outside the template)…

To illustrate the second case : now you want to render the list in a nice table on your web page. But you should also consider the edge-case where the table is empty. In that case, instead of the table, you want a message such as “There is no data to display”. So what do you do ? Inside your template, you do something like this :

[IF LIST IS EMPTY]
"These is no data to display"
[ELSE]
    [FOR ELEMENT IN LIST]
        "[ELEMENT.ID] has foo=[ELEMENT.FOO]"
    [ENDFOR]
[ENDIF]

This logic goes into the template, because it is specific to the way you want to render things in the specific context of this template. You might have chosen to simply display an empty table instead of a message.

What we are talking about is essentially the second case : for instance we have our $path_url and sometimes need to handle the edge case where it’s only / and adapt the final result. In fact, we already put the logic inside the template, via a #sub_path_only thing. But because we are not using a template engine, we need to add code outside to actually implement the corresponding behavior (if … then replace #sub_path_only by empty string). So basically we are just hacking our own template engine. And it is or will be :

either not practical (having to add (or read) code outside the template each time you want a if-like structure)
either not flexible (we could hard-code the behavior of #sub_path_only in the helpers, but then somebody will come up with a different case which will also need to be added to the helper, or then we decide to code our own __IF__).

So I disagree with the statement that “logic inside the template is bad because you’re not going to see what’s going on when you’re reading the app’s scripts.” or similarly “Black box effect : we don’t know in the script what is done in the config file.”. The purpose of putting the logic inside the template is precisely to factorize the template from the main code to make everything easier :

to understand the template, you don’t need to read the rest of the code, because the template is self-explanatory (modulo of course having an idea of what is path_url for instance).
to understand the rest of the code, you don’t need to know what’s inside the template. When you are trying to understand a general installation procedure, you don’t care knowing that a node-js conf file (or whatever) has specific tweaks if $path_url if /…

To comment on other points that were brought :

Imho this is not a battle of “python vs bash” and neither about “having to learn a whole complete new language, hence is complicated”. In fact, what we are talking about is how we manage configuration files, and those configuration files come in all variety and complexity - I mean it’s not as if packagers where only able to speak bash. To package an app, you need to learn / understand command line interfaces, nginx configurations, systemd configuration, php configurations, SQL databases, regex, other things, and the whole adminsys chaos in general. What we are talking about here is : essentially having to write {{ variable }} instead of __VARIABLE__. Then there is the occasional {% if foo %} some stuff {% endif %}. I mean we’re not talking about having everybody coding nested for loops and specific jinja filters.
Agreed that the syntax might no be as readable as __VARIABLE__. I did some tests and changing delimiters from {{ to __ works … but you have to put spaces, so one would have to write __ VARIABLE __. Also it’s not really recommended to change the delimiter because they are a recognizible convention. A middle-ground could be to keep the current delimiters but have the variable in caps, which imho is 70% of the readability. So something like {{ VARIABLE }}.
about the migration issue : we could either have a special extension, e.g. .j2, or else detect the presence of e.g. __PATH_URL__ or {{ path_url }} to apply the corresponding template technique.

Maniack_Crudelis · June 1, 2018, 9:51am

We’re using the helper ynh_replace_string everywhere we need an action for the template. I don’t see any duplication.

jinja2 isn’t going to fix that. The only way to regularize that would be to change PATH to PATH_URL.

Agree

Yes you do !!!
When an install fails, the first thing we’re going to do is to read the install script to find if there’s any mistake with the code or the template engine.
And, a thing that everyone here seems to forgot is, most of the time it isn’t your code, so you have to understand how the script has been built by the packager. I did that many time, and external functions (_common.sh) and external template logic are really a pain in the ass when you try to understand how it works and especially when you’re trying to find where the bug is hiding.
The same problem occurs to me a few day ago when I tried to git blame an install script to find an old error in the script. But, I dug into it until factorized functions in common. I lost 10 more minutes to play with git history and blame between files to find the involving commit.

So, it’s not that you will have to “read code outside the template”, but the problem will be that you will have to read code outside the script.
A template is easier to read with:

#sub_path_only rewrite ^__PATH__$ __PATH__/ permanent;

Which is clear for everyone will read the comment.
No one will need to have a look to the install script to understand what’s going to happen with this. The comment is self explaining.

Than:

{% if path_url != "/" %}
rewrite ^{{ path_url }}$ {{ path_url }}/ permanent;
{% else %}
{% set path_url = "" %}
{% endif %}

Which can be understandable if you take time to read and analyze the code. But, when you’re reading the install script, you’re just not going to know that going to be done in the nginx config file. And that’s a problem, because the install script (and other scripts) are the main thing when you’re looking to an app.

Not everybody will use other specific features from jinja2, but some will. And that’s going to be a problem when we will have to read that code.
Apps are really often changing their maintainers, either definitively or just to fix a bug. That kind of specific code (globally all is about factoring the code somewhere else than the main scripts) is really the worst thing I’ve ever had to do with.

Because we have a real problem with the limited list of variables that could be used, I agree that we should to something about that. But, using something that really going to add a huge complexity and discomfort isn’t a solution for me.
I’d rather prefer to code a simple and efficient helper to use our own and simple template and be able to use any variable we want.

JimboJoe · June 1, 2018, 11:58am

I’ll try to step into the debate after some time to think about it… and after having tried to implement the alias traversal fix :

First, when you pointed to the jinja FAQ, I think you missed the point… there is no definitive answer as to wether mixing logic and templates is a good or a bad solution (see here)
We have (as the group apps) to admit we’re a lot used to the actual templates… and obviously you the most as you put a tremendous effort into refactoring almost every official app last summer (thanks so much for that)!
That’s why we’ll feel quite confortable with the current system and see a new system as a time investment.
Moreover, we’re used to bash as well… which is for me a very good example of difficult (to say the least) language to learn and use… that’s already a HUGE learning gap for newcomers in my opinion.
As you’re pointing out the effort to become the maintainer of an abandoned app, there will always be an effort to put in. We’re aiming at reducing this effort by putting good practices/helpers in place, but as any maintainer can do whatever he/she wants with bash, there will always be an effort.

That being said, what’s my particular opinion on that particular point (so I’m sparing the “I think” at the beginning of every sentence ):

I can’t say our solution to alias traversal with #sub_path_only is stellar! The nginx change is not so obvious, and the bash code to add both to install AND upgrade script is just horrible.
In this situation, I think the jinja template solution is better. I also think it’s better in any “self-sufficient” configuration file, such as nginx, php-fpm or systemd. Because: you only need to act on one file (vs install/upgrade… and also change_url, and maybe later other scripts dedicated to specific configuration as proposed yesterday by @Bram), and the syntax doesn’t seem more complicated to me.
Regarding the syntax… it’s not a worldwide standard, but at least it’s common between Jinja (for python), Twig (for PHP), Mustache (for anything, and used in yunohost-admin!), and that’s already far more standard than our own syntax…

There’s also another debate that’s been collaterally triggered here: is it better or not to factorize code of apps… ?
Here again, I think it depends on our diverse experiences. I personally didn’t care when the factorization was removed from synapse, but I would say my personal preference tends toward more factorization.
Yes, it can be difficult to git blame on several sources, but that solution can’t certainly scale to programs with thousands lines of code…
And having scripts with comment saying “if this code is modified, don’t forget to change in that scripts as well” looks overall like something that won’t scale on the long term (see my prior example of install/upgrade/change_url/future scripts).

In this situation, having some other opinions from other apps maintainers would be great, and would maybe help us move forward in that interesting and constructive debate!

Maniack_Crudelis · June 1, 2018, 2:26pm

Just a quick and dirty POC to propose a fix for the problem with limited number of variables.

#!/bin/bash

ynh_replace_variables () {
	local variables_string="$1"
	local file_to_change="$2"
	local first_delimiter="__"
	local last_delimiter="__"

	# Parse variables list
	# Get the number of variable by using the special awk variable NF. Which count the number of fields.
	local nb_var=$(echo "$variables_string" | awk '{ print NF }')
	# Then separate each variable
	local var_nb
	local variables_array
	for var_nb in `seq 1 $nb_var`
	do
		# Get each variable by using awk with print $1, $2 etc...
		# And store them separately in the array $variables_array.
		variables_array+=( "$(echo "$variables_string" | awk "{ print \$$var_nb }")" )
	done

	# Build another array with corresponding string to replace
	# By using the same name of the previous given variables set as uppercase
	local replacement_array
	for var_nb in `seq 0 $(( $nb_var - 1 ))`
	do
		# Get each variable by using awk with print $1, $2 etc...
		# And store them separately in the array $variables_array.
		# Each variable will be set as uppercase by `^^`.
		replacement_array+=( "${first_delimiter}${variables_array[$var_nb]^^}${last_delimiter}" )
	done

## DEBUG ONLY
cp $file_to_change ${file_to_change}.mod
file_to_change=${file_to_change}.mod
## DEBUG ONLY

	# Replace effectively variables in the file
	for var_nb in `seq 0 $(( $nb_var - 1 ))`
	do
		# Replace each variable by using ynh_replace_string
		# Call variables_array[$var_nb] with ! before to get the content of the variable instead of the variable itself
		ynh_replace_string "${replacement_array[$var_nb]}" "${!variables_array[$var_nb]}" "$file_to_change"
	done
}

# ynh_replace_string from https://github.com/YunoHost/yunohost/blob/unstable/data/helpers.d/string#L13-L34
ynh_replace_string () {
	local delimit=@
	local match_string=$1
	local replace_string=$2
	local workfile=$3

	# Escape the delimiter if it's in the string.
	match_string=${match_string//${delimit}/"\\${delimit}"}
	replace_string=${replace_string//${delimit}/"\\${delimit}"}

	sed --in-place "s${delimit}${match_string}${delimit}${replace_string}${delimit}g" "$workfile"
}

path=mypath
port=8080
ynh_replace_variables "path port" "nginx.conf"

app=monapp
env_path=$PATH
finalpath=/var/www/$app
ynh_replace_variables "app env_path finalpath" "systemd.service"

language=en
ynh_replace_variables "port language" "settings.json"

ynh_replace_string is added only to get rid of source /usr/share/yunohost/helpers in this POC.
Below the function, 3 examples on files from etherpad. nginx.conf, systemd.service and settings.json

Here, we could easily change the delimiter if we want.
The variables to replace are defined by the packager himself, and anyone can read easily which variables are going to be replaced. There’s no limit to the number of variables to change.
That’s easy to use and clear to read.
And, it does ONLY its job, which is replacing variables, nothing more.

(Quickly tested, coded in half a hour, so just a POC)

Josue · June 1, 2018, 3:32pm

Personally after reading the different arguments I also think that Jinja is a better solution.

Josue · June 9, 2018, 7:23pm

And why didn’t you say anything about that when we was speaking about that here : Your application has been elected to become official ! · Issue #31 · YunoHost-Apps/synapse_ynh · GitHub

JimboJoe · June 10, 2018, 9:00am

Well, simply because I was very busy at that time and about to be away from keyboard from several weeks… added to the fact that I value @Maniack_Crudelis’s standpoint as well

JoclelynD · June 17, 2018, 8:19am

Hello here, I’m a really occasional packager, but here is my personal feedback on that.

background : i’m more a web developper (python) than a sysadmin, but i use to do both activities.

I think jinja/django/mustache template syntax is quite widespread both for web developers and for a part of sysadmins (it is ansible and chef syntax). I did no sociological study but it seems that the population of packagers inculde a lot of web devs.

As @aleks pointed out, things like loops can really be a valuable bonus. Another good point in favor of jinja is : it is already documented.