issue: log error if still using issue_pat_XXX configuration after it was removed in d24051ce961c
Show a helpful:
  found unsupported issue_prefix_pr = 'PR' - use issue_sub_pr instead
before bailing out with:
  skipping incomplete issue pattern 'issue_pat_pr': '(?:PR\\s*)(\\d+)' -> 'https://kallithea-scm.org/repos/kallithea/pull-request/{id}' None
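A minimal sketch of such a check, assuming the settings are available as a plain dict. The function names and the prefix constants are hypothetical and only mirror the message quoted above; this is not the code in the changeset.

```python
import logging

log = logging.getLogger(__name__)

# Assumption: the removed setting family is 'issue_prefix<suffix>' and its
# replacement is 'issue_sub<suffix>', as in the message quoted above.
OLD_PREFIX = 'issue_prefix'
NEW_PREFIX = 'issue_sub'


def find_unsupported_issue_settings(config):
    """Return (old_key, value, suggested_key) for each legacy setting found."""
    found = []
    for key in sorted(config):
        if key.startswith(OLD_PREFIX):
            suffix = key[len(OLD_PREFIX):]
            found.append((key, config[key], NEW_PREFIX + suffix))
    return found


def log_unsupported_issue_settings(config):
    """Log one error per legacy setting so the admin knows what to rename."""
    for old_key, value, new_key in find_unsupported_issue_settings(config):
        log.error('found unsupported %s = %r - use %s instead', old_key, value, new_key)
```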
scripts/i18n: update i18n howto with recipe for use of scripts/i18n tooling
For now, just adding a new section. It should be integrated with the existing documentation. Some steps can perhaps be simplified or automated so we need fewer words.
uwsgi: slim down and tweak the default '[uwsgi]' configuration section
The goal is to have a basic working setup that shows how and why to use uWSGI. System administrators should check uWSGI documentation for further information and general advice about operations.
uwsgi: drop unnecessary dependency of http module - just use http-socket directly
The http plugin has more advanced http functionality like https and load balancing, duplicating what in many setups is handled by separate front-end servers. In this simple template, just use the basic http-socket.
i18n: fix dead code in Accept-Language workaround from 7c7d6b5c07c7
AppConfig is just providing defaults, and the .ini file has not been read yet. There is thus no point in checking if i18n.lang has been set, and the code would thus *always* set i18n.lang=en ... but that was fine, as it just is a default.
email templates: fix missing translation of titles and buttons
The buttons and titles of email templates were not correctly translated. The corresponding strings were not part of the i18n files because they were not recognized by the extraction logic.
The benefit of this functionality is questionable, especially in bigger setups with multiple front-end instances all serving the same multitude of repositories, which makes the hit rate very low. The overhead of storing cache invalidation data *in* the database is also non-trivial.
We preserve a small cache in Repository SA records, but should probably just in general know what we are doing and not ask for the same information multiple times in each request.
setup: install pip in virtualenv to make sure we have the latest version
Older versions are good enough for bootstrapping (and might also be good for everything else) but give:
WARNING: You are using pip version 19.3.1; however, version 20.0.2 is available. You should consider upgrading via the 'pip install --upgrade pip' command.
The user might still get this warning initially, but then it goes away ...
config: set base_path config in set_app_settings using Ui.get_repos_location() instead of in app_cfg using make_ui()
Only hit the database once (when starting the application) to get this.
It would perhaps be more elegant to set it directly, for example in kallithea.base_path ... but it seems like a setting that really belongs in the .ini file ...
auth: for default permissions, use existing explicit query result values instead of following dot references in ORM result objects
There have been reports of spurious crashes on resolving references like .repository from Permissions:
File ".../kallithea/lib/auth.py", line 678, in __wrapper
  if self.check_permissions(user):
File ".../kallithea/lib/auth.py", line 718, in check_permissions
  return user.has_repository_permission_level(repo_name, self.required_perm)
File ".../kallithea/lib/auth.py", line 450, in has_repository_permission_level
  actual_perm = self.permissions['repositories'].get(repo_name)
File ".../kallithea/lib/vcs/utils/lazy.py", line 41, in __get__
  value = self._func(obj)
File ".../kallithea/lib/auth.py", line 442, in permissions
  return self.__get_perms(user=self, cache=False)
File ".../kallithea/lib/auth.py", line 498, in __get_perms
  return compute(user_id, user_is_admin)
File ".../kallithea/lib/auth.py", line 190, in _cached_perms_data
  r_k = perm.UserRepoToPerm.repository.repo_name
File ".../sqlalchemy/orm/attributes.py", line 285, in __get__
  return self.impl.get(instance_state(instance), dict_)
File ".../sqlalchemy/orm/attributes.py", line 721, in get
  value = self.callable_(state, passive)
File ".../sqlalchemy/orm/strategies.py", line 710, in _load_for_state
  % (orm_util.state_str(state), self.key)
sqlalchemy.orm.exc.DetachedInstanceError: Parent instance <UserRepoToPerm at ...> is not bound to a Session; lazy load operation of attribute 'repository' cannot proceed (Background on this error at: http://sqlalche.me/e/bhk3)
Permissions are cached between requests: SA result records are stored in beaker.cache.sql_cache_short and reused in following requests after the initial session has been removed. References in Permission objects would usually give lazy lookup ... but not outside the original session, where we would get an error like this.
Permissions are indeed implemented/used incorrectly. That might explain a part of the problem. Even if not fully explaining or fixing this problem, it is still worth fixing:
Permissions are fetched from the database using Session().query with multiple class/table names (joined together in a way that happens to match the references specified in the table definitions) - including Repository. The results are thus "structs" with selected objects. If repositories always were retrieved using this selected repository, everything would be fine. In some places, this was what we did.
But in some places, the code happened to do what was more intuitive: just use .repository and rely on "lazy" resolving. SA was not aware that this one already was present in the result struct, and would try to fetch it again. Best case, that could be inefficient. Worst case, it would fail as we see here.
Fix this by only querying from one table but use the "joinedload" option to also fetch other referenced tables in the same select. (This might inefficiently return the main record multiple times ... but that was already the case with the previous approach.)
This change is thus doing multiple things with circular dependencies that can't be split up in minor parts without taking detours:
The existing repository join like:
  .join((Repository, UserGroupRepoToPerm.repository_id == Repository.repo_id))
is thus replaced by:
  .options(joinedload(UserGroupRepoToPerm.repository))
Since we only are doing Session.query() on one table, the results will be of that type instead of "structs" with multiple objects. If only querying for UserRepoToPerm this means:
- perm.UserRepoToPerm.repository becomes perm.repository
- perm.Permission.permission_name looked at the explicitly queried Permission in the result struct - instead it should look in the dereferenced repository as perm.permission.permission_name
scripts/i18n: also normalize casing of UTF-8 in Content-Type
f626260a376c introduced invariant msgmerge casing. Do the same when normalizing to ensure consistency also without msgmerge and to avoid unnecessary conflicts.
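A minimal sketch of this kind of casing normalization, assuming the header line has the usual gettext form and that upper-case 'UTF-8' is the target casing. The function name is hypothetical and the real script may match the header differently.

```python
import re

# Match the charset value of a PO 'Content-Type' header in any casing.
# Assumption: the invariant casing to normalize to is upper-case 'UTF-8'.
_CHARSET_RE = re.compile(r'(Content-Type: text/plain; charset=)utf-8', re.IGNORECASE)


def normalize_charset_casing(po_text):
    """Rewrite 'charset=utf-8' (any casing) to 'charset=UTF-8'."""
    return _CHARSET_RE.sub(lambda m: m.group(1) + 'UTF-8', po_text)
```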
scripts/i18n: introduce --merge-pot-file to control normalization
There are actually *two* kinds of normalization:
- in main branches, where we just want the translations - not any trivially derived information or temporary or unstructured data.
- in i18n branches, where we want the trivially derived information, and also want to preserve any other information there might be in the .po files.
If no pot file is specified, do it as on the main branches and strip everything but actual translations. This mode will primarily be used when grafting or rebasing changes from i18n branches.
When a pot file is specified, run GNU msgmerge with it on the po files. The pot file should ideally be fully updated (as done by extract_messages). That will establish a common baseline, leaving only the essential changes as needing merge.
If merging from default branches to i18n, it is better to skip .po and .pot in the first 'hg merge' pass, while resolving everything else. Then, with the uncommitted merge, run 'extract_messages', and then merge the .po files using --merge-pot-file kallithea/i18n/kallithea.pot .
(Actually, these two different modes could perhaps be auto detected ...)
scripts/i18n: add command 'normalized-merge' for use with Mercurial's 'merge-tool' option
Add a 'normalized-merge' command to scripts/i18n that will first normalize the i18n files contributing to the merge, then perform a standard merge. If that merge fails (e.g. due to real conflicts) the normalized files are left behind, and the user needs to run another merge tool manually and resolve the merge of these.
Use by putting the following snippets in your .hgrc file:
The translation files in the Kallithea repository contained references to the location(s) of each string in the repository. This is useful to translators, but is not needed for all other users.
The big problem with that information is that it changes very commonly as a result of normal development in Kallithea, causing a lot of unimportant delta in the Kallithea repository, thus causing unnecessary repository growth.
In this commit, a basic version of the script is added, only containing the code to normalize the translation files by removing generated and outdated data.
This can be used to check or ensure internal consistency between code and translations, by extracting and merging and then removing most of it again with normalize-po-files:
./setup.py extract_messages
for po in kallithea/i18n/*/LC_MESSAGES/kallithea.po; do msgmerge --width=76 --backup=none --previous --update "$po" kallithea/i18n/kallithea.pot ; done
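The removal of location references described above can be sketched as follows. This assumes only the standard gettext convention that reference comments start with '#:'. The function name is hypothetical, and the actual script also strips other generated and outdated data.

```python
def strip_location_comments(po_text):
    """Drop gettext reference comments ('#: path:line') from PO file content."""
    kept = [
        line
        for line in po_text.splitlines(keepends=True)
        if not line.startswith('#:')
    ]
    return ''.join(kept)
```

Translator comments ('# ...') and the msgid/msgstr pairs are left untouched, so the translations themselves survive.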
scripts/i18n: introduce new i18n maintenance script
The translation files in the Kallithea repository contained references to the location(s) of each string in the repository. This is useful to translators, but is not needed for all other users.
The big problem with that information is that it changes very commonly as a result of normal development in Kallithea, causing a lot of unimportant delta in the Kallithea repository, thus causing unnecessary repository growth.
A script 'i18n' is added to help maintain the i18n files. Functionality will be added later.
Traceback (most recent call last):
File "scripts/update-copyrights.py", line 45, in <module>
  from . import contributor_data
ImportError: attempted relative import with no known parent package
Fixed by backing out changeset 2786730e56e0. The pytype problem can be solved differently.
user: make get_by_username_or_email default to treat username case insensitive
The get_by_username_or_email is a flexible function, intended to find users in multiple ways, suitable for login prompts. The function was sometimes used with case sensitive user lookup, sometimes without. Instead, be consistent and just default to be insensitive.
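The idea of defaulting to case insensitive lookup can be illustrated with a small stand-alone sketch using an in-memory SQLite table. The table layout, the parameter name and the email handling are assumptions made for the illustration; the real function works on the Kallithea User model.

```python
import sqlite3


def get_by_username_or_email(conn, key, case_insensitive=True):
    """Return the (username, email) row matching key, or None."""
    if case_insensitive:
        name_clause = "lower(username) = lower(?)"
    else:
        name_clause = "username = ?"
    # Assumption for this sketch: email is always compared case insensitively.
    sql = (
        "SELECT username, email FROM users WHERE %s OR lower(email) = lower(?)"
        % name_clause
    )
    return conn.execute(sql, (key, key)).fetchone()


# Tiny in-memory table to try the lookup on.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('JohnDoe', 'john@example.com')")
```

With the default, a login prompt entry of 'johndoe' finds the 'JohnDoe' account; callers that need exact matching have to ask for it explicitly.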
auth: show a clear "Authentication failed" message if login fails after passing form validation
log_in_user will only set a session cookie after verifying that the user is valid (for example based on IP). The code is thus safe, but no hint was given to the user if login failed for that reason.
login: assert that the validated user actually is found
Due to another bug, it was possible that authentication succeeded but the user object couldn't be obtained. This was for example noticed when the LDAP auth module did not correctly parse the email attribute, and a login via email was attempted. In this case, the user was retrieved from email address and LDAP found the user, but the email attribute in the Kallithea database was then changed incorrectly and a subsequent retrieval based on the same original email address would not find the user.
Such a problem would lead to an assert in Kallithea:
File ".../kallithea/controllers/login.py", line 104, in index
  auth_user = log_in_user(user, c.form_result['remember'], is_external_auth=False, ip_addr=request.ip_addr)
File ".../kallithea/lib/base.py", line 122, in log_in_user
  assert not user.is_default_user, user
AttributeError: 'NoneType' object has no attribute 'is_default_user'
This assert caught the problem but is not a spot-on indicator of the real problem. Instead, we can catch this problem sooner by adding an assert already in the login controller.
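A minimal sketch of such an early assert, assuming a lookup callable that returns None when no user is found. Both names are hypothetical and stand in for the controller code.

```python
def get_validated_user(lookup_user, username):
    """Fetch the user that just passed validation and fail early if missing."""
    user = lookup_user(username)
    # Fail right here, with the login name in the message, instead of later
    # with an AttributeError on None.
    assert user is not None, username
    return user
```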
Viewing the two-way diff of an added file gives the following exception:
File "_base_root_html", line 211, in render_body
File "_base_base_html", line 42, in render_body
File "files_diff_2way_html", line 197, in render_main
File ".../kallithea/lib/vcs/nodes.py", line 411, in is_binary
  return b'\0' in self.content
TypeError: 'in <string>' requires string as left operand, not bytes
At this point, self.content was '' (empty string).
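The failure mode can be reproduced in isolation: testing for a bytes value inside a str raises TypeError, even when the str is empty. The helper below is only a sketch of one defensive way to handle it (coercing to bytes first) and is not necessarily how the changeset fixes it.

```python
def is_binary(content):
    """Return True if content contains a NUL byte; accepts bytes or str."""
    if isinstance(content, str):
        # Assumption for this sketch: text content is encoded as UTF-8.
        content = content.encode('utf-8')
    return b'\0' in content
```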
Due to a missing conversion from bytes to unicode for the attribute values obtained from LDAP, storing the values in a unicode field in the database would fail. It would apparently either store a repr of the bytes or store them in some other way.
Upon user login, SQLAlchemy warned about this:
.../sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value b'John'. (this warning may be suppressed after 10 occurrences)
.../sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value b'Doe'. (this warning may be suppressed after 10 occurrences)
In PostgreSQL, this would result in 'weird' values for first name, last name, and email fields, both in the database and the web UI, e.g.
  firstname: \x4a6f686e
  lastname: \x446f65
  email: \x6a6f686e406578616d706c652e636f6d
These values represent the actual values in hexadecimal, e.g.
  \x4a6f686e = 0x4a 0x6f 0x68 0x6e = J o h n
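The hexadecimal values above can be decoded back for inspection with a few lines of Python. The helper name is hypothetical, and UTF-8 is assumed as the encoding of the original bytes.

```python
def decode_hex_value(value):
    """Decode a '\\x...' hex string as shown above back to readable text."""
    if not value.startswith('\\x'):
        raise ValueError('expected a value starting with \\x: %r' % value)
    return bytes.fromhex(value[2:]).decode('utf-8')
```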
In SQLite, the problem initially shows differently, as an exception in gravatar_url():
File "_base_root_html", line 207, in render_body
File "_index_html", line 78, in render_header_menu
File "_base_base_html", line 479, in render_menu
File ".../kallithea/lib/helpers.py", line 908, in gravatar_div
  gravatar(email_address, cls=cls, size=size)))
File ".../kallithea/lib/helpers.py", line 923, in gravatar
  src = gravatar_url(email_address, size * 2)
File ".../kallithea/lib/helpers.py", line 956, in gravatar_url
  .replace('{email}', email_address) \
TypeError: replace() argument 2 must be str, not bytes
but nevertheless the root cause of the problem is the same.
Fix the problem by converting the LDAP attributes from bytes to strings.
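A minimal sketch of such a conversion, assuming the attributes arrive as a dict mapping attribute names to lists of bytes values (the shape python-ldap returns for search results) and that the values are UTF-8 encoded. The helper name is hypothetical and not taken from the changeset.

```python
def decode_ldap_attrs(attrs, encoding='utf-8'):
    """Return a copy of attrs with every bytes value decoded to str."""
    return {
        name: [
            value.decode(encoding) if isinstance(value, bytes) else value
            for value in values
        ]
        for name, values in attrs.items()
    }
```

Decoding once, right where the values come out of LDAP, keeps bytes from reaching the database layer or the templates.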
hg: handle Mercurial RepoError correctly in is_valid_repo_uri
RepoError would be leaked by is_valid_repo_uri.
Now, when for example validating an ssh URL without having the ssh client binary, it will be shown/logged as:
  remote: /bin/sh: ssh: command not found
  2020-03-17 17:28:53.907 WARNI [kallithea.model.validators] validation of clone URL 'ssh://no-ssh.com/' failed: Mercurial RepoError: no suitable response from remote hg
and shown in the UI as 'Invalid repository URL'.