Blog

Super Environment

Posted by Manuel Reinhardt at Nov 16, 2017 03:50 PM |

If you're using supervisor a lot then you may know that you can pass environment variables to subprocesses. Ideally you would also know that if you don't do that then the environment of the shell that was used to start supervisor is inherited (see Supervisor Documentation - Subprocess Environment). But in the heat of battle you may forget that it makes a lot of difference whether this starting shell is something like an interactive bash session or a cron job.

In one of our projects we had applied a workaround that relied on setting the PATH environment variable so that a specific version of a tool was used. The PATH was set in the bashrc, so when starting supervisor manually from bash it worked fine. But at some point the machine was restarted and an @reboot cron job started the supervisor. And suddenly a different version of the tool was used...
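The difference is easy to reproduce outside supervisor. The following is a minimal sketch in Python, not supervisor's own code: a child process started without an explicit environment inherits whatever its parent has, while passing `env` pins the value no matter how the parent was started. The directory `/opt/tool-1.2/bin` is a made-up example path.

```python
import os
import subprocess
import sys

# Child command that only prints the PATH it sees.
CHILD = [sys.executable, "-c", "import os; print(os.environ.get('PATH', ''))"]


def child_path(env=None):
    """Return the PATH seen by a child process.

    With env=None the child inherits the parent's environment, which is
    what happens when no environment is configured explicitly.
    """
    result = subprocess.run(
        CHILD, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Inherited: depends on who started us (interactive shell, cron job, ...).
inherited = child_path()

# Pinned: hypothetical tool directory goes first, independent of the parent.
pinned_env = dict(
    os.environ,
    PATH="/opt/tool-1.2/bin" + os.pathsep + os.environ.get("PATH", ""),
)
pinned = child_path(pinned_env)

print("inherited:", inherited)
print("pinned:   ", pinned)
```

In supervisor itself, the corresponding safeguard would be to set the variable explicitly for the program instead of relying on the shell that happened to start supervisor.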

Against The Flow

Posted by Manuel Reinhardt at Jun 17, 2016 10:45 AM |

In our company we're using git a lot. We've started practising Git Flow for some of our larger projects. Recently, an article about GitLab Flow (with comparisons to Git Flow and GitHub Flow) got me thinking whether Git Flow is the right flow for us. Two problems with Git Flow that the article mentions immediately struck a nerve with me:

* it's quite complicated
* it deviates from the convention that master is the default branch

The second issue is easily addressed by simply using a different name for the branch that holds the release tags (say, "production"). The Git Flow extension even lets you configure that ("git flow init").

The first one is not so easily remedied, though. You have to remember to merge feature branches to develop, hotfix branches to master and develop, etc. It gets easier if you're using the Git Flow extension, but the commit history is still full of merge commits that don't add much information. Also, if you're using a release helper like zest.releaser in addition, it adds more steps to the process which can create additional confusion. And, even with the Git Flow extension, mistakes happen and can be hard to resolve.

GitHub Flow is on the other end of the spectrum in terms of complexity. It boils down to doing development in feature branches and merging them back to master (after review).

GitLab Flow occupies the middle ground. It defines a production branch that reflects deployable code. When you want to make a release you merge master into production. Nothing else is done on "production" and it is never merged to any other branch.

You can add optional branches that hold code deployed to separate environments (demo, staging, etc.). There's also a concept of release branches but that doesn't seem relevant to us.

The second half of the article (starting from "Merge/pull requests with GitLab flow") is mostly concerned with integration of the git workflow into issue tracking. That's an interesting topic as well but I'll stick to the actual git workflow for now.

To decide whether one of these flows is appropriate for us we need to ask, of course, what we need vs. what they offer. The basic idea is of course to get to a deployable state in a quick and sane fashion. Feature branches probably make sense for all our projects to keep things separate until they are ready to share. So, do we need more than that?

All of the mentioned workflows seem to assume that continuous deployment is practised, or at least that code is deployed via git checkout. I remember learning, ages ago, never to deploy from version control. But times change and it seems to be quite common by now to do just that. In fact, part of the intention of the discussed workflows seems to be to define a safe way of deploying directly from git. In our company we haven't adopted that approach yet, but we might think about doing so for some projects, at least for the code that never gets released to the public anyway. It would of course eliminate the overhead of creating eggs.

If we stick to creating eggs for now, though, I don't see much benefit of having a separate branch only for deployable versions, be it called "master" or "production" or whatever. This seems to suggest using GitHub Flow.

However, we sometimes have the situation that we have long-running branches for bigger development packages that the customer wants deployed in one go, not bit by bit. At the same time we will still have small help desk developments or bug fixes in parallel using more frequent (e.g., weekly) deployments. This is addressed by Git Flow by having a dedicated "develop" branch where long-running development can happen and that receives any helpdesk/bugfix commits which happen in the hotfix branches. However, this is where a lot of the complexity comes in.

Can we simplify this a little? The need to merge hotfix branches into develop in addition to master seems to arise from the fact that the develop branch is always continued. Alternatively we could say that we create dedicated branches with a finite lifetime for every development package. E.g. in one of our projects we usually have quite some time in between each development phase, so it doesn't make much sense to keep the develop branch open. Instead we would create a named branch ("p8", "phase2", ...) off master, and once the package is finished we merge back to master and delete that branch. If anything else has happened on master in the meantime it will be in the release because the release is created from master. The next development branch will be branched off of master again, so it will have those commits as well.

Or we could even always use "develop" as the development branch name. There's no problem with merging develop to master and deleting it, and then later creating a new branch that's called "develop" again. Of course this doesn't work if there are multiple development packages in progress at the same time, but that's probably not a good idea anyway. The advantage of reusing the branch name is obvious - you always know what the current development branch is called.

Maybe this could become the basis of a simplified Git Flow for projects where we need it: Regular smaller development (like help desk and bug fixes) on master, larger packages on named branches (could always be named "develop"); plus feature branches that branch off and merge back to either. Some convention may be helpful to mark which branch merges back to master and which to the development branch.

I feel we should also talk about the review process when merging a feature branch, but before I get carried away any more I'll stop here.

There's no doubt that there is no such thing as the universally best workflow. That's why I mentioned looking at our different projects - different projects may call for different workflows. However, my hope is that we could find or come up with a modular workflow (like GitLab Flow) and simply ignore some optional parts for our simpler projects while taking advantage of the more advanced workflow features for projects that need a little more structure.

The Mime Type Magic Show

Posted by Manuel Reinhardt at Aug 08, 2015 02:40 PM |

"Articles of Association" seems like a perfectly fine description for a file object. However, trying to set this string as the description using a custom form failed for me in a Plone instance. After saving, the description returned an empty string. Other strings ("123", "My ridiculous test description", etc.) were saved just fine. After a while I could narrow it down to the string "Article". If the description started with this string and had at least one following character, we ended up with an empty string.

A descent into the depths of Archetypes and Plone core finally revealed the reason: No mime type was supplied for the description, so the MimetypesRegistry tried to guess it from the beginning of the string. There are a number of hard coded "magic numbers" in the magic module, which aren't actually always numbers but sometimes strings, including "Article". If the beginning of the description matched one of these strings, a mime type other than text/plain was guessed, and things went south from there. Without a match, "text/plain" was assumed and all was jolly.

The problem only surfaced after years in production. This is understandable as most of the "magic" strings would almost never appear at the beginning of a description string ("MM\x00\x2a", "<xbel", etc.). However, some of them very occasionally might happen to make it there ("Article", "Only in ", "import ", etc.).

The solution was to explicitly specify a mime type in the form with a hidden field.

<input type="hidden" name="description_text_format" value="text/plain" />

This is read by Archetypes and prevents any guessing.
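To illustrate the mechanism, here is a toy model of prefix-based guessing. It is not the MimetypesRegistry code: the magic strings are the ones mentioned above, but the guessed types are placeholders and the function name and signature are made up for this sketch.

```python
# Toy model of prefix-based MIME type guessing (NOT MimetypesRegistry itself).
# Magic strings are taken from the post; the guessed types are placeholders.
MAGIC_PREFIXES = {
    "Article": "application/x-guessed-a",
    "Only in ": "application/x-guessed-b",
    "import ": "application/x-guessed-c",
}


def guess_mimetype(value, mimetype=None):
    """Return the explicit mimetype if given, else guess from the prefix."""
    if mimetype is not None:
        return mimetype  # an explicit type short-circuits any guessing
    for prefix, guessed in MAGIC_PREFIXES.items():
        # Require at least one character after the magic string,
        # mirroring the behaviour observed with "Article".
        if value.startswith(prefix) and len(value) > len(prefix):
            return guessed
    return "text/plain"


print(guess_mimetype("Articles of Association"))
print(guess_mimetype("My ridiculous test description"))
print(guess_mimetype("Articles of Association", mimetype="text/plain"))
```

The last call corresponds to the hidden form field: once a type is supplied, the content of the string no longer matters.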

To me, this problem also raises the question, "How smart should software try to be?" I'd love to see a piece of software that is smart enough to "understand" input without being explicitly programmed to do so. However, we're a long way from there, and often enough attempts at writing "magic" functions that try to be clever lead to unexpected problems that are a pain to debug. Maybe the problem is connected to the programming principles of Single Responsibility and Modularity. As a human developer I see the big picture and it's obvious to me that a description is always a plain text string. The MimetypesRegistry only looks at the string that is passed in, without any knowledge of what it is used for. If it saw (and understood) the big picture it might come to the same conclusion as the human developer. But as I said, we're a long way from there.

All scripts in the basket

Posted by Manuel Reinhardt at Mar 20, 2015 02:59 PM |

In a buildout that has several sections using zc.recipe.egg I had the problem that a script I was building in one section ended up with the initialization of another script in another section.

Let's assume we have an egg "main_egg_a" that defines a script "script_a" and "common_egg" defines a script "common_script" and a couple of other scripts. The way we want to use script_a requires us to also include "common_egg" in the respective section. In the other section we want only "common_script", skipping any other scripts declared by "common_egg", so we declare it with the "scripts" option.

[script-a]
recipe = zc.recipe.egg
eggs =
    main_egg_a
    common_egg
initialization =
    import os
    os.environ["foo"] = "emerald"

[script-b]
recipe = zc.recipe.egg
eggs =
    common_egg
scripts = common_script
initialization =
    import os
    os.environ["foo"] = "zirconia"

With this configuration however, bin/common_script kept having "emerald" set in the environment variable.

The problem is that the section [script-a] does not have a "scripts" option, but also includes common_egg. Without "scripts" declared it builds all the scripts in the given eggs, here script_a and common_script. If both sections build bin/common_script, the result depends on the order of execution. In my setup the script-b section always ran first, building bin/common_script with "zirconia" as the value for "foo". Then script-a ran afterwards, overwriting the previously built bin/common_script using its own setting for "foo", "emerald".

The fix is simple once you've figured out what's going on: Declare "scripts" if you give multiple eggs to the recipe.

[script-a]
recipe = zc.recipe.egg
eggs =
    main_egg_a
    common_egg
scripts = script_a
initialization =
    import os
    os.environ["foo"] = "emerald"

[script-b]
recipe = zc.recipe.egg
eggs =
    common_egg
scripts = common_script
initialization =
    import os
    os.environ["foo"] = "zirconia"
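
The overwrite can be replayed with a small simulation. This is a simplified model of the behaviour described above, not zc.recipe.egg code; `other_script` stands in for the "couple of other scripts" of common_egg, and the `build` helper is invented for this sketch.

```python
# Simplified model of how script files get (over)written; not zc.recipe.egg.
# "other_script" is a stand-in for the other scripts declared by common_egg.
EGG_SCRIPTS = {
    "main_egg_a": ["script_a"],
    "common_egg": ["common_script", "other_script"],
}


def build(sections):
    """Run sections in order and return {script name: value of "foo"}."""
    bin_dir = {}
    for section in sections:
        declared = [s for egg in section["eggs"] for s in EGG_SCRIPTS[egg]]
        # Without a "scripts" option, every declared script is generated.
        wanted = section.get("scripts", declared)
        for name in declared:
            if name in wanted:
                bin_dir[name] = section["foo"]  # later sections overwrite
    return bin_dir


script_b = {"eggs": ["common_egg"], "scripts": ["common_script"], "foo": "zirconia"}
script_a = {"eggs": ["main_egg_a", "common_egg"], "foo": "emerald"}

# script-b runs first, then script-a, as in the setup described above.
broken = build([script_b, script_a])
fixed = build([script_b, dict(script_a, scripts=["script_a"])])

print(broken["common_script"])  # emerald
print(fixed["common_script"])   # zirconia
```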

Reversion Control

Posted by Manuel Reinhardt at Nov 03, 2014 04:50 PM |

Assuming you're using git for version control, let's say you've merged a feature branch into the master branch of some repository. Then you decide that the feature is not quite ready yet after all, and you revert the merge. Later on, after a few more modifications, you merge the feature branch into master again. You could think that now you have all the changes from the feature branch in master. But that's probably not correct. The man page for git revert says:

"Reverting a merge commit declares that you will never want the tree changes brought in by the merge. As a result, later merges will only bring in tree changes introduced by commits that are not ancestors of the previously reverted merge. This may or may not be what you want."

(Git 1.9.1 03/19/2014 GIT-REVERT(1))

In a situation like this I was able to bring back the changes by reverting the revert commit - no guarantees for this, though, as I'm not yet aware of any recommendation or best practice for this.

Stay On The Path

Posted by Manuel Reinhardt at Jun 14, 2014 01:03 PM |
Consider a Plone browser view like this:
from Acquisition import aq_inner
from five import grok  # assuming five.grok, the usual grok flavour in Plone
from zope.component import getUtility
from zope.intid.interfaces import IIntIds

class SomeView(grok.View):
    grok.name('some-view')

    def render(self):
        doc = aq_inner(self.context)
        intids = getUtility(IIntIds)
        to_id = intids.getId(doc)
        [...]
        
There's nothing wrong with this code, yet we got a KeyError on the getId call. After a while we figured out that this only happens when virtual hosting is configured to hide the portal name (e.g. http://somehost/ proxies to http://localhost:8080/Plone) AND the URL contains the portal name anyway (http://somehost/Plone/... - this happens if you use getPhysicalPath() instead of absolute_url()). In this case the Acquisition chain contains the portal twice and aq_iter (from five.intid.utils) wrongly detects a __parent__ loop. Removing the portal name from the URL (by using absolute_url()) solved the issue.

What's the user

Posted by Manuel Reinhardt at May 15, 2014 11:50 AM |

If you want to check for a permission on a user other than the one who is currently logged in, this will not do what you'd expect:

> user.checkPermission(ModifyPortalContent, self.context)

This acquires the checkPermission() method from the MembershipTool and actually checks the permission on the currently logged-in user (authenticated member), ignoring the user object. In theory, this should do the trick:

> user.has_permission(ModifyPortalContent, self.context)

However, at least in my code, this returns False because Acquisition claims that the user object and self.context are not in the same acquisition context. What finally worked for me was:

> from plone import api
> with api.env.adopt_user(user=user):
>    user.checkPermission(ModifyPortalContent, self.context)

This temporarily switches the security context to user, which is then used for permission checks. If you're not using plone.api, you'll do something like

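> from AccessControl.SecurityManagement import getSecurityManager, newSecurityManager, setSecurityManager
> from zope.globalrequest import getRequest
> old_security_manager = getSecurityManager()
> newSecurityManager(getRequest(), user)
> try:
>     user.checkPermission(ModifyPortalContent, self.context)
> finally:
>     setSecurityManager(old_security_manager)  # always restore the original user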

Run, xul, run

Posted by Manuel Reinhardt at Sep 12, 2012 05:10 PM |

My eclipse/birt installation has been giving me a headache lately, spontaneously crashing with some reports when I tried to open the preview or the report viewer. With the latter I got the error message

No more handles [Could not detect registered XULRunner to use]

Investigating, I found that ubuntu does not supply a xulrunner package any more. I tried getting xulrunner from mozilla, but eclipse does not support the latest xulrunner versions. Of course, older versions only come in 32 bit flavour. I tried a 32 bit version of eclipse but didn't get it working (later I discovered that was because I set the XULRunnerPath vmarg incorrectly - it mostly works now, except for lots of "wrong ELF class" messages).

So I grabbed the xulrunner 1.9.2.19 source, checked the XULRunner tutorial, and after much cursing managed to compile the bugger.

In addition to the prerequisites mentioned in the tutorial, I had to install some more packages:

sudo apt-get install libxt-dev libidl-dev gcc-4.4 g++-4.4

I set the following environment variables:

CC=/usr/bin/gcc-4.4
CXX=/usr/bin/g++-4.4
JAVA_HOME=/usr/lib/jvm/java-6-sun/

Also, at some point make complained about a missing file curl/types.h, referenced from toolkit/crashreporter/google-breakpad/src/common/linux/http_upload.cc. It turned out that curl/types.h has become obsolete, so I just deleted the line

#include <curl/types.h>

from http_upload.cc. Another missing file was Linux3.0.mk. I crossed my fingers and tried

cp security/coreconf/Linux2.6.mk security/coreconf/Linux3.0.mk

...and it worked!

This is my .mozconfig:

mk_add_options MOZ_CO_PROJECT=xulrunner
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-@CONFIG_GUESS@
mk_add_options MOZ_MAKE_FLAGS="-j4"
ac_add_options --enable-application=xulrunner
ac_add_options --disable-tests

and this an excerpt from my eclipse.ini:

[...]
-vmargs -Dosgi.requiredJavaVersion=1.5 -Dhelp.lucene.tokenizer=standard -XX:MaxPermSize=256m -Dorg.eclipse.swt.browser.XULRunnerPath=/opt/mozilla-1.9.2/obj-x86_64-unknown-linux-gnu/dist/bin -Xms40m -Xmx512m

(I only added the -Dorg.eclipse.swt.browser.XULRunnerPath line)

I'm using eclipse-reporting-juno-linux-gtk-x86_64 on ubuntu precise (12.04.1), and apart from the "Preview" tab it works fine at the moment.

A new property for groups

Posted by Manuel Reinhardt at Feb 20, 2012 08:44 PM |
This took me a while to puzzle out: If you want your user groups in Plone to have a custom property (in my case an "alias" of type string), you create a PAS property plugin that implements IPropertiesPlugin.
from zope.interface import implements
from Products.PluggableAuthService.interfaces.plugins import IPropertiesPlugin
from Products.PluggableAuthService.plugins.BasePlugin import BasePlugin

class GroupAliasPlugin(BasePlugin):
    """ Provides the property 'alias' for groups """
    meta_type = 'Group Alias Plugin'
    implements(IPropertiesPlugin)
Basically you only need to implement the method getPropertiesForUser. To make the property editable, it should return a MutablePropertySheet instance and you also should implement setPropertiesForUser.
from Products.PlonePAS.sheet import MutablePropertySheet
...
    def getPropertiesForUser(self, user, request=None):
        ...
        return MutablePropertySheet(self.id, schema=[('alias', 'string'),], **data)

    def setPropertiesForUser(self, user, propertysheet):
        ...
Now we need to make our plugin addable via the ZMI:
from App.special_dtml import DTMLFile

def manage_addGroupAliasPlugin(self, id, title='',
                               RESPONSE=None, schema=None, **kw):
    """ Add a Group Alias Plugin. Comes with custard or ice cream """
    o = GroupAliasPlugin(id, title, schema, **kw)
    self._setObject(o.getId(), o)

    if RESPONSE is not None:
        RESPONSE.redirect('manage_workspace')

manage_addGroupAliasPluginForm = DTMLFile('zmi/GroupAliasPluginForm', 
                                          globals(),
                                          __name__='manage_addGroupAliasPluginForm')
And in __init__.py:
from AccessControl.Permissions import add_user_folders
from Products.PluggableAuthService import registerMultiPlugin
from ... import groupalias

registerMultiPlugin(groupalias.GroupAliasPlugin.meta_type)

def initialize(context):
    """Initializer called when used as a Zope 2 product."""
    context.registerClass(groupalias.GroupAliasPlugin,
                         permission=add_user_folders,
                         constructors=(groupalias.manage_addGroupAliasPluginForm,
                                       groupalias.manage_addGroupAliasPlugin),
                         visibility=None)
We're almost there. To use the plugin, you have to add it to acl_users and activate it, AND you have to tell portal_groupdata that it should handle an additional property. If you want it active by default, you can do it in setuphandlers.py like this:
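import logging
from Products.CMFCore.utils import getToolByName
# GroupAliasPlugin must also be imported from the plugin module shown above.

log = logging.getLogger(__name__)

def addGroupProperties(context):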
    """ Add a GroupDataPlugin to acl_users and add property to
        portal_groupdata
    """
    site = context.getSite()

    # most of this was borrowed from pas.plugins.ldap, thanks!
    pluginid = 'groupalias'
    pas = site.acl_users
    installed = pas.objectIds()
    if pluginid in installed:
        log.info("%s already installed." % pluginid)
    else:
        plugin = GroupAliasPlugin(pluginid, title="Group Alias Plugin")
        pas._setObject(pluginid, plugin)
        plugin = pas[plugin.getId()] # get plugin acquisition wrapped!
        for info in pas.plugins.listPluginTypeInfo():
            interface = info['interface']
            if not interface.providedBy(plugin):
                continue
            pas.plugins.activatePlugin(interface, plugin.getId())
            pas.plugins.movePluginsDown(
                interface,
                [x[0] for x in pas.plugins.listPlugins(interface)[:-1]],
            )

    # Our PAS property plugin already provides additional properties,
    # but we need to let portal_groupdata know about them.
    gd_tool = getToolByName(site, 'portal_groupdata')
    if gd_tool.hasProperty('alias'):
        gd_tool._delProperty('alias')
    gd_tool._setProperty('alias', '', 'string')

Lost library

Posted by Manuel Reinhardt at Oct 25, 2011 05:28 PM |
When trying to start IPython in a virtualenv I kept getting the error
ImportError: No module named _sqlite3
I was using a python executable that I compiled myself, and it took me some time to realize that the compile process could find the sqlite header files but not the library itself. It turned out that on this machine (Ubuntu 11.04) the file libsqlite3.so is in /usr/lib/x86_64-linux-gnu, but python expected it in /usr/lib. Creating a softlink did the trick.
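
A quick way to check for this situation from within the interpreter is sketched below. It assumes a current Python 3 and uses only the standard library; the function name `sqlite_status` is made up for this example.

```python
import ctypes.util
import importlib.util


def sqlite_status():
    """Report whether this interpreter has sqlite support compiled in."""
    return {
        # True if the _sqlite3 extension module was built for this Python.
        "extension_built": importlib.util.find_spec("_sqlite3") is not None,
        # Name of the shared library the system can locate, or None.
        "system_library": ctypes.util.find_library("sqlite3"),
    }


print(sqlite_status())
```

If the extension is missing although the system library is found, the library was most likely not where the build process looked for it, as in the case described above.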

Smaller Blocks

Posted by Manuel Reinhardt at Apr 27, 2011 08:05 PM |

A post in the gentoo forums just helped me realize that blocks with a leading "<", e.g.

[blocks B     ] <dev-libs/libxml2-2.7.7 ("<dev-libs/libxml2-2.7.7" is blocking sys-libs/zlib-1.2.4) 

are quite easy to resolve by first upgrading the blocking package to a version newer than the specified one (2.7.7 in this case) and then proceeding with the original merge.

reinhardt@floyd ~ $ emerge -av1 libxml2 && emerge --update world
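
The logic of a "<" blocker can be sketched in a few lines of Python. This is a deliberate simplification that only handles plain dotted versions; Portage's real version comparison also deals with suffixes and revisions, which are ignored here. Both function names are made up for this sketch.

```python
def parse_version(version):
    """Split a plain dotted version like "2.7.7" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_blocked(installed, blocker_below):
    """A "<pkg-X" blocker applies to every installed version lower than X."""
    return parse_version(installed) < parse_version(blocker_below)


print(is_blocked("2.7.6", "2.7.7"))  # True: the old version still blocks
print(is_blocked("2.7.8", "2.7.7"))  # False: upgrading resolved the block
```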

Writing this post helped me realize that escaping "<" can be important.

Exiftentialism

Posted by Manuel Reinhardt at Aug 22, 2010 01:09 PM |
Today I tried to look up when exactly I had taken some photographs and discovered I had forgotten to set the EXIF USE flag on my Eee PC. Consequently GIMP and Eye of Gnome had been built without EXIF support. I set the USE flag and remerged, discovering in the process that I had not fully grasped the use of --newuse (more later, maybe). A short while later I was happy to find the EXIF data inside the EoG image properties window. But then I found a few pictures without that data. I browsed back and forth a little and found that all the photos in portrait format were missing EXIF. And of course, they would be. I had rotated and re-saved them on this very system with the EXIF-unaware Eye of Gnome. Bummer.

Share what you open

Posted by Manuel Reinhardt at Jun 08, 2010 11:31 AM |
I often work remotely on a server with two or more shells running at once. Lately the thought that one ssh connection should be enough for a couple of shells got stuck in my mind. I checked the man page and discovered the ControlPath and ControlMaster parameters. The former specifies the location of a control socket that is used for a shared connection, while the latter determines if the current ssh process should manage the shared connection or just connect to the socket as a slave. Adding the lines
ControlPath ~/.ssh/ssh-%r@%h:%p.sock
ControlMaster auto
to your ssh config file (usually ~/.ssh/config) causes a starting ssh process to check the specified location for a socket and, if it exists, connect to it, or if it doesn't, create a new shared connection including the socket. NB. If your connection (shared or not) gets stuck when you leave it idle for a while, it could be due to firewall restrictions. Check the ServerAliveInterval parameter.

Old-Fashioned Alice

Posted by Manuel Reinhardt at Jun 03, 2010 01:25 PM |
After setting up internet in our new flat, I wondered why there often was a delay of almost half a minute between entering a web address and firefox starting to load the data. I found out that firefox sent IPv6 DNS queries (type AAAA), which our Alice-DSL router didn't understand. After 4 tries had timed out, it finally sent an IPv4 query (type A) that succeeded. The solution with the best cost-benefit ratio seemed to be setting
network.dns.disableIPv6 true
in firefox's about:config. Strangely, though, I found the address of an Alice DNS server (213.191.74.12 or dnsp03.hansenet.de) that didn't show these problems when I set it manually in /etc/resolv.conf.
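
The same idea, asking only for IPv4 addresses, can be tried from Python with the standard socket module. This is just an illustration of restricting a lookup to one address family, not what firefox does internally; `resolve` is a made-up helper and "localhost" is used so the example does not depend on an external DNS server.

```python
import socket


def resolve(host, family=socket.AF_UNSPEC):
    """Return the sorted addresses of host, restricted to one address family."""
    infos = socket.getaddrinfo(host, None, family=family)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


# AF_INET limits the lookup to IPv4 results; AF_UNSPEC (the default)
# allows IPv6 results as well.
print(resolve("localhost", socket.AF_INET))
```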

Convert multiple lines to comma separated list

Posted by Manuel Reinhardt at Jun 01, 2010 11:16 AM |
To concatenate multiple lines of text and separate them by commas, I use this sed script:
sed -n '$!{s/.*/\0,/;H};${H;x;s/\n//gp}'
This way you can e.g. use the output of a ps command as an argument for top to only display java and python processes:
top -p `ps -C python,java -o pid --no-headers | sed -n '$!{s/.*/\0,/;H};${H;x;s/\n//gp}'`
Thanks go to bakunin, whose script in a thread at unix.com got me started.
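
For comparison, the same join can be sketched in Python. The function name and the sample PIDs are made up for this example.

```python
def join_lines(text, separator=","):
    """Join all lines of text into a single line, dropping the newlines."""
    return separator.join(text.splitlines())


# Sample input standing in for the output of the ps command.
sample = "4711\n4712\n4713\n"
print(join_lines(sample))  # 4711,4712,4713
```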

Open Office and Poppler

Posted by Manuel Reinhardt at May 16, 2010 11:40 PM |
Trying to update Open Office on my Eee 1005H running gentoo from 3.1.1 to 3.2.0, I ran into some blocks caused by poppler and related packages. I discovered I had to unmerge the following packages first:
dev-libs/poppler
dev-libs/poppler-glib
app-text/poppler-utils
virtual/poppler
virtual/poppler-glib
virtual/poppler-utils
Emerge is still running at the moment, but looks like all's going smoothly now.