<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <id>https://aaltoscicomp.github.io/blog/</id>
  <title>Aalto Scientific Computing Blog</title>
  <updated>2026-05-20T19:59:42.621025+00:00</updated>
  <link href="https://aaltoscicomp.github.io/blog/"/>
  <link href="https://aaltoscicomp.github.io/blog/blog/atom.xml" rel="self"/>
  <generator uri="https://ablog.readthedocs.io/" version="0.11.13">ABlog</generator>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2026/what-is-ai/</id>
    <title>What is AI? Everything, everywhere, all at once</title>
    <updated>2026-04-27T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="what-is-ai-everything-everywhere-all-at-once"&gt;

&lt;p&gt;As Research Software Engineers (RSEs) in a university, we get lots of
AI questions and projects. But what is “AI” exactly? It can mean
almost anything, and it makes a bit of a problem when we and our
customers may not be starting from a shared understanding of what “AI”
is.  Let’s discuss.&lt;/p&gt;
&lt;p&gt;For us research engineers, a “real AI” project might mean “writing
Python code and optimizing deep learning training using a large
computer cluster” or “does a Mixture of Experts-model produce better
accuracy compared to a traditional feed-forward network in my use
case?”.  In practice, “AI” questions to us have included:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Problems with web APIs&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Problems with PyTorch&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Problems with HPC libraries&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Problems with laws and regulations&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Problems with web servers&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Problems with Python installations&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I think the effect of “AI” is to reduce the effort needed to use
computing tools, so the amount of computing people want to do increases.
These increases are proportional to all the usual computing projects we
get, even if the projects aren’t exactly deep learning training. Thus
the increased need for our team and a general increase in computing
and data literacy.&lt;/p&gt;
&lt;p&gt;In the end, “AI” is such an overloaded term that it can mean anything.
To better ask for help with “AI”, it’s good to be able to decipher it
down to the actual topic.  This post isn’t about scientific computing in
general, so let’s look deeper below at what the core “AI” use cases
are:&lt;/p&gt;
&lt;section id="types-of-ai"&gt;
&lt;h2&gt;Types of AI&lt;/h2&gt;
&lt;p&gt;This blog post evolved out of a colleague’s talk at &lt;a class="reference external" href="https://aaltoscicomp.github.io/NoBSC/"&gt;NoBSC 2026&lt;/a&gt;, where they pointed out the
issue that “AI” can mean anything, so it isn’t good to speak of
“AI” projects without further clarification.  In that talk, we thought
of two broad categories for the actual uses of “AI”:&lt;/p&gt;
&lt;p&gt;“AI” can mean pattern matching and decision making. In this, you have
some input data and predict some output based on that.&lt;/p&gt;
&lt;p&gt;“AI” can mean content generation, as in generating text, images, or
more.  This is, in fact, also a prediction of an output based on an
input prompt.&lt;/p&gt;
&lt;p&gt;You notice these two categories are pretty normal things? That’s because
they are; it’s just that modern machine learning (deep learning) has
gotten so much better at them than even a decade ago.  It’s not
intelligence, it’s predictions that seem human-like or better.&lt;/p&gt;
&lt;p&gt;These categories aren’t scientific - it’s just what we use as a base
for discussion with our customers.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="pattern-matching-and-decision-making"&gt;
&lt;h2&gt;Pattern matching and decision making&lt;/h2&gt;
&lt;p&gt;This is a traditional use of machine learning. Here, you take input
data and predict some output. This is usually done by having a lot of
sample input data and corresponding “true” outputs for that data, and
training a model. When you give the model new input data, it can
predict an output. This is known as “supervised learning”.&lt;/p&gt;
&lt;p&gt;This needs specific training data and the output model is usually
specific to that domain or task.  Deep learning can certainly find
patterns that humans or traditional machine learning can’t - assuming
such patterns exist in the first place.&lt;/p&gt;
&lt;p&gt;For an advanced example, an industrial plant has sensors monitoring the
whole process and records of each time it broke down. By using this
data, “AI” might be able to predict breakdowns more accurately than a
human or non-deep machine learning tools could. These types of uses are
relatively un-objectionable to society.&lt;/p&gt;
&lt;p&gt;Other examples include things such as insurance companies using all of
their data to analyze claims and preemptively deny coverage to those who
they think are more likely to get sick. Or using pattern matching to
approve/deny claims without a human taking responsibility. These have
much more societal impact and thus lead to lots of suspicion about “AI”,
since it’s being used to diffuse responsibility.&lt;/p&gt;
&lt;p&gt;Another type of pattern matching is classifying things without having
any true labels. This is called “unsupervised learning”. One example
would be the “Netflix Challenge” where scientists tried to use watching
data to predict what would be relevant recommendations. It doesn’t
matter what the detected categories are, just that things go together.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="content-generation"&gt;
&lt;h2&gt;Content generation&lt;/h2&gt;
&lt;p&gt;Just like the section title says, this is generating content. Examples
could be chatbots (generating text) or image generators based on some
input. This has become so widespread since 2022 that it’s easy to
think that this is “AI”.&lt;/p&gt;
&lt;p&gt;Under the hood, this is actually pattern matching, since it takes a
prompt, uses all the previous input data, and generates a predicted
output. There isn’t actually “intelligence” under the hood, and it’s
all limited by the power of the algorithms and the source data. It can
be wrong, not useful, etc.  Content generation is definitely &lt;em&gt;not&lt;/em&gt; at
human-level, and does &lt;em&gt;not&lt;/em&gt; have the wide background and task
knowledge of a human.  It is much faster, though.&lt;/p&gt;
&lt;p&gt;Some people use content-generating methods to make predictions.
This isn’t as refined as an actual pattern-matching/decision-making
method, but because of the general-purpose nature of content
generation, it can work without much effort.  One example I’ve heard
of is using large language models (LLMs) with an input such as “is
this social media post positive or negative sentiment? Answer with one
word ‘positive’ or ‘negative’”.  You get a sentiment analyzer with
very little work, that has some large implicit background knowledge.
This is the power of these so-called “foundation models” that can do
many tasks.  The downside is that it uses much more computing resources and
has more chance of going off the rails (hallucination, implicit biases of
input data, etc.).&lt;/p&gt;
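&lt;p&gt;A minimal sketch of that sentiment idea might look like the following. The &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;ask_llm&lt;/span&gt;&lt;/code&gt; argument is a hypothetical placeholder for whatever function sends a prompt to your model and returns its text, and &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;fake_llm&lt;/span&gt;&lt;/code&gt; is only a stand-in so the example runs without a model. Note the check at the end: the model is free to answer something else entirely.&lt;/p&gt;

```python
PROMPT = (
    "Is this social media post positive or negative sentiment? "
    "Answer with one word 'positive' or 'negative'.\n\nPost: "
)


def classify_sentiment(post, ask_llm):
    """Classify a post using any callable that maps a prompt string to text.

    ask_llm is a hypothetical placeholder, not a real library function.
    """
    answer = ask_llm(PROMPT + post)
    # Normalize things like "Positive." to "positive".
    answer = answer.strip().lower().strip(".!'\"")
    if answer in ("positive", "negative"):
        return answer
    return None  # the model went off the rails; the caller must handle this


def fake_llm(prompt):
    """Stand-in for a real model so the sketch runs. Not a sentiment model."""
    return "Positive." if "love" in prompt else "I cannot tell"


print(classify_sentiment("I love this cluster", fake_llm))
print(classify_sentiment("meh", fake_llm))
```

&lt;p&gt;With a real model, every call costs far more computing than a small dedicated classifier would, which is the trade-off described above.&lt;/p&gt;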
&lt;p&gt;Content generation is powerful, but has potential for misinformation or
misuse on a massive scale. One probably wants to be careful when using
content generation for predictive tasks.  The prevalence of generated
content and its potential for misuse is behind a lot of the backlash
to “AI”.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="commercial-platforms"&gt;
&lt;h2&gt;Commercial platforms&lt;/h2&gt;
&lt;p&gt;While not a category, there are also commercial platforms that do the
above things. They are set up to be easy to use by a broad
audience. We can help with these things, but most of the actual “AI”
work is already done.  ChatGPT’s early dominance in “AI” was probably
caused almost as much by figuring out a useful, usable interface for
the general public as by its underlying “AI” technology.&lt;/p&gt;
&lt;p&gt;So, while commercial platforms and purchasable tools may be easy, they
usually have very limited information about how they work (hidden
behind “AI” to make you think it’s intelligent), and similar things
can be done locally given enough time and effort.  The delegation of
accountability (and thinking) to third-party platforms is also behind
part of the backlash to “AI”.&lt;/p&gt;
&lt;p&gt;Examples include the chatbots that everyone uses, coding assistants,
text summarizers, etc.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-you-should-do"&gt;
&lt;h2&gt;What you should do&lt;/h2&gt;
&lt;p&gt;If you, or someone, has an “AI” project, the first step is to think
deeper and figure out what the goal really is.  rkdarst has an old saying,
“explain it to me again without any terms invented or made popular in
the last ten years”.  This helps to peel back these layers and get a
description of what is actually needed, and is probably quite useful
when trying to figure out what an “AI” project actually is.  If you can
explain your project without saying “AI”, you are well on the way to
solving it with “AI”.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Also, we shouldn’t separate “AI” from computing skills in
general. “AI” lets people do more computing with less work, but it
means everyone needs a higher base level of knowledge (even about
not-exactly-“AI” topics).&lt;/strong&gt; Think back to when cars became cheaper and
more reliable. Once cars became more common, people didn’t have to
know as much about their internals, but many more people had to learn
how to drive and interact with them.  (We’re not saying we want our
cities infested with cars, nor AI).  This is true even when the field
of study isn’t “computing”.  &lt;strong&gt;Don’t let “AI literacy” become “ChatGPT
literacy”, it is literacy in computing, data, and problem solving -
and a healthy suspicion for details hidden behind jargon.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;You certainly can benefit from “AI” without knowing all the details.
It’s very hard to benefit from it without knowing your actual goal.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: this blog post was written with zero “AI” content generation.
My anonymous colleague has contributed some of the key ideas and
title.&lt;/em&gt;&lt;/p&gt;
&lt;/section&gt;
&lt;section id="definitions"&gt;
&lt;h2&gt;Definitions&lt;/h2&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Machine learning&lt;/strong&gt;: A field of study using statistical algorithms
to learn from data and generalize to unseen data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Deep learning&lt;/strong&gt;: Machine learning using multilayered neural networks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Supervised learning&lt;/strong&gt;: Machine learning methods taking input and
labeled “true” outputs as the training data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unsupervised learning&lt;/strong&gt;: Compared to the above, algorithms
learning from input data which is unlabeled.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Training&lt;/strong&gt;: The process of iterating through the input data to
find patterns, resulting in the trained model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Model&lt;/strong&gt;: The learned parameters from input data and training,
which can be combined with the right code to make predictions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Application programming interface (API)&lt;/strong&gt;: An interface that makes
it easy for computer code to interact with something (as opposed
to a human-optimized interface).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;“Artificial intelligence”&lt;/strong&gt;: Did you think I was going to give a single
simple definition here?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="see-also"&gt;
&lt;h2&gt;See also&lt;/h2&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://fundamentals-of-secure-ai-systems-with-personal-data-9cd9e2.pages.code.europa.eu/ch1.html#what-is-artificial-intelligence"&gt;Fundamentals of secure AI systems with personal data -&amp;gt; What is
artificial intelligence?&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2026/what-is-ai/"/>
    <summary>As Research Software Engineers (RSEs) in a university, we get lots of
AI questions and projects. But what is “AI” exactly? It can mean
almost anything, and it makes a bit of a problem when we and our
customers may not be starting from a shared understanding of what “AI”
is.  Let’s discuss.</summary>
    <published>2026-04-27T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2026/dilemma-of-partition-config/</id>
    <title>The dilemma of setting Slurm parameters</title>
    <updated>2026-04-16T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="the-dilemma-of-setting-slurm-parameters"&gt;

&lt;p&gt;Sometimes people come to us and complain that there are idle cluster
GPUs, and they could be used if there wasn’t a per-user limit on max
GPUs that any one user could use at once.  Other people come to us and
complain that all GPUs are in use by various people, and they can’t
start jobs quickly.&lt;/p&gt;
&lt;p&gt;Perhaps you see the dilemma.  People expect both that GPUs are usually
available for them, and that GPUs can be used to the fullest
extent.  We’ll use GPUs as an example here, but this isn’t specific to
GPUs.&lt;/p&gt;
&lt;p&gt;We are very aware of this and try to enable as much overall research
as possible.  Still, there are choices to be made, and in this post we
will try to describe them, so that our users can better give feedback
for how we should adjust things.&lt;/p&gt;
&lt;p&gt;We wrote this post so users can understand what’s going on in the
background and let us know when something seems wrong.&lt;/p&gt;
&lt;section id="broad-picture"&gt;
&lt;h2&gt;Broad picture&lt;/h2&gt;
&lt;p&gt;An HPC cluster is fundamentally designed for batch work: for a given
amount of resources, schedule them as efficiently as possible to get
the maximum amount of computation out, with as high resource
utilization as possible.  We, and many clusters, have some resources
reserved for interactive use, since interactive testing and debugging
is extremely useful for getting work done.&lt;/p&gt;
&lt;p&gt;We also have a “fairshare” system, where the use of users should be
equalized over the long run.  This means that if one user runs a lot
now (because the cluster was somewhat empty), their priority will be
less later.  The Triton priority decay half-life is 14 days.&lt;/p&gt;
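&lt;p&gt;To get a feel for the 14-day half-life, here is a minimal sketch. It assumes a simplified model in which past usage simply decays exponentially; Slurm’s real priority calculation has more factors than this, and the function name is made up for illustration.&lt;/p&gt;

```python
HALF_LIFE_DAYS = 14.0  # the Triton half-life stated above


def decayed_usage(gpu_hours, age_days, half_life_days=HALF_LIFE_DAYS):
    """Return how much of some past usage still counts after age_days.

    Simplified model (an assumption): pure exponential decay.
    """
    return gpu_hours * 0.5 ** (age_days / half_life_days)


# 1000 GPU-hours used today, seen from a few points in the future.
for days in (0, 14, 28, 56):
    print(days, "days later:", decayed_usage(1000, days), "GPU-hours still count")
```

&lt;p&gt;In this model, a big burst of usage counts for half as much after two weeks and a quarter as much after four.&lt;/p&gt;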
&lt;/section&gt;
&lt;section id="problem-one-user-is-running-too-many-jobs"&gt;
&lt;h2&gt;Problem: one user is running too many jobs&lt;/h2&gt;
&lt;p&gt;The situation: one user is dominating GPU use.  Is this fair?&lt;/p&gt;
&lt;p&gt;If one user was able to fill up the cluster, that means that at the
time the jobs started, there were no other users waiting for those
resources.  If there were multiple users waiting, then the resources
would have been split a bit more fairly (according to their
priorities).&lt;/p&gt;
&lt;p&gt;Don’t worry, once their current jobs finish, their priority will be
much lower and everyone else will have a much higher priority to run
next.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="problem-all-gpus-are-full-until-x-days-from-now"&gt;
&lt;h2&gt;Problem: All GPUs are full until X days from now&lt;/h2&gt;
&lt;p&gt;Situation: one user has a lot of GPUs in use.  I know I can wait for
them to finish, but they last many days.  Do I really have to wait
X days before I can get stuff started?&lt;/p&gt;
&lt;p&gt;Sometimes, if the cluster is free, a user can submit many long jobs.
This means resources aren’t being wasted right now (which is good),
but the resources remain occupied for that duration (max time 3 or 5
days on Triton).  This is a bit annoying.  This is mainly a problem
when the cluster is mostly empty, since if there are lots of things
running, jobs turn over frequently enough that people can get some
resources quickly (and the heavy users have lower priority at the
time, so the more recent users have priority for free slots).&lt;/p&gt;
&lt;p&gt;In this case, we usually wait and just let the situation develop.
Once we get to a “steady state fullness”, jobs cycle fast enough that
it’s unusual for the cluster to get into this state.  There aren’t that
many free GPUs opening up all at once without multiple users queuing,
so it can’t get overloaded by one user.&lt;/p&gt;
&lt;p&gt;We don’t want to prohibit all long jobs, since long jobs are useful
especially for new users.  Yes, heavy users can and should adapt to
checkpointing and mainly using small jobs, but we don’t want to force
everyone to go straight into the hardest, purest way of using a cluster.&lt;/p&gt;
&lt;p&gt;One option is the Slurm partition parameter &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;GrpTRESRunMins&lt;/span&gt;&lt;/code&gt;
(“Trackable RESources Run Minutes”): this is a limit not on number of
jobs, or length of jobs, but sum(job_resources×job_length).  If this
was 120 GPU-hours, then one could run one 5-day single-GPU job, or thirty 4-hour
single-GPU jobs at once.  By tuning this, we can make it so that one can run long
jobs, or use all the cluster, but not use all the cluster with long
jobs.&lt;/p&gt;
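&lt;p&gt;The arithmetic can be checked with a short sketch. It assumes every job uses the number of GPUs given and runs for its full length; the function name and the job tuples are made up for illustration, and this is not how Slurm itself is configured or queried.&lt;/p&gt;

```python
GPU_HOUR_LIMIT = 120  # the example limit from the text, in GPU-hours


def remaining_budget(jobs, limit=GPU_HOUR_LIMIT):
    """Return the GPU-hours left under the limit (negative means over it).

    jobs is a list of (number_of_jobs, gpus_per_job, hours_per_job) tuples.
    """
    total = sum(n * gpus * hours for n, gpus, hours in jobs)
    return limit - total


print(remaining_budget([(1, 1, 5 * 24)]))   # one 5-day single-GPU job
print(remaining_budget([(30, 1, 4)]))       # thirty 4-hour single-GPU jobs
print(remaining_budget([(30, 1, 5 * 24)]))  # thirty 5-day jobs: far over
```

&lt;p&gt;The first two cases use the budget exactly, while the third (the whole cluster filled with long jobs) is far over it, which is the case the limit is meant to stop.&lt;/p&gt;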
&lt;/section&gt;
&lt;section id="problem-the-cluster-has-free-slots-but-i-m-not-allowed-to-use-them"&gt;
&lt;h2&gt;Problem: the cluster has free slots but I’m not allowed to use them&lt;/h2&gt;
&lt;p&gt;Situation: There are free GPUs, but Slurm doesn’t let me use them.
Isn’t this a waste?&lt;/p&gt;
&lt;p&gt;Clearly this is the opposite of the two situations above.
We’d normally like to prevent this situation, but there are some
reasons it may occur.  Sometimes, we do have a limit on the max number
of jobs that can run.  Hopefully this is temporary while we work
something out.  Sometimes, we have various resources reserved for
short jobs, for interactive jobs, and so on.  Sometimes someone has
bought their own dedicated resources and we want to leave some
available for them.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="problem-i-can-t-get-work-done-in-time-for-a-deadline"&gt;
&lt;h2&gt;Problem: I can’t get work done in time for a deadline&lt;/h2&gt;
&lt;p&gt;Situation: I have a conference deadline in a few days, and I need as
many resources as possible to finish my submission.&lt;/p&gt;
&lt;p&gt;Unfortunately, this isn’t really how a cluster works.  It could work
for clusters that are really bought by one group and they can decide
what runs, but Triton resources are bought for general use and we
don’t free up resources for deadlines.  The fairshare system may also
affect you here, with you getting fewer resources if you have used a
lot in the past.&lt;/p&gt;
&lt;p&gt;There are other clusters that may be usable and have more free
resources (or you may have higher priority since you haven’t run as
much there lately).  It’s good if you can make your code portable, or
ask for our help early enough in your work and we can help do that.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="problem-it-takes-too-long-to-iterate-code-during-development"&gt;
&lt;h2&gt;Problem: It takes too long to iterate code during development&lt;/h2&gt;
&lt;p&gt;Situation: Each time I submit a job, I have to wait to see if it
works, edit, and try again.  This is slow.&lt;/p&gt;
&lt;p&gt;Indeed.  We try to save some resources in a debug partition
(&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;gpu-debug&lt;/span&gt;&lt;/code&gt;), which are in theory always available (but have a very
short time limit, like 15 minutes).  However, it’s only easy to
allocate a whole node to a debug partition, and four or more GPUs is a
lot to spend on a partition that’s mostly idling, so sometimes we
don’t have the most advanced GPUs available there.&lt;/p&gt;
&lt;p&gt;Triton’s GPU debug partition does overlap with a lot of different
other nodes and has a high partition priority, so if you submit there
it’ll hopefully run ASAP.&lt;/p&gt;
&lt;p&gt;We also have interactive partitions, which you can open in OnDemand to
do development work.  We don’t have GPUs with huge memory there, since
interactive GPUs are mostly idle and not doing computing (we mainly
have older GPUs and Multi-instance-GPUs (MiGs) which split one GPU
into several with smaller memory).  Everything in the rest of this
post, about balancing amount used and convenience, can be repeated
with interactive GPUs.  The more we give for interactive work, the
more GPUs are idle overall.  It’s a balance we are constantly trying
to adjust.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="problem-i-see-a-user-is-using-gpus-inefficiently"&gt;
&lt;h2&gt;Problem: I see a user is using GPUs inefficiently&lt;/h2&gt;
&lt;p&gt;Situation: I’ve tried to see what is slowing stuff down, and noticed
one user has low GPU efficiency.  Should they really be using GPUs in
that case?&lt;/p&gt;
&lt;p&gt;We aren’t aiming for maximum GPU calculations, we aim for getting the
most work done.  Some work is CPU-bound but GPUs can speed up part of
it.  Some work uses other third-party code and can’t be optimized.
Sometimes the bottleneck is just somewhere else but the GPU still
significantly speeds things up.  With this, we don’t want to prevent
someone from doing their work just because it’s not perfectly
GPU-bound.&lt;/p&gt;
&lt;p&gt;We do scan for low efficiency users and invite them to garage to see
if we can make things faster.  If someone is using expensive
resources, we consider there’s an obligation to work with us to make
the usage as efficient as possible.  And yes, sometimes they are using
the GPUs optimally for their own case.  Also note that GPU occupancy
doesn’t mean the GPU is doing useful work - sometimes the measures can
be off.&lt;/p&gt;
&lt;p&gt;If you see a user that you think is inefficiently using the resources,
don’t contact them yourself (unless you are their friend, colleague,
etc.).  Let us know and we’ll investigate if we haven’t done so yet.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="problem-others-are-using-my-group-s-dedicated-resources"&gt;
&lt;h2&gt;Problem: others are using my group’s dedicated resources&lt;/h2&gt;
&lt;p&gt;Situation: My group has purchased dedicated resources and they are
working as part of the cluster.  Someone else is using them, and it
slows down our use.&lt;/p&gt;
&lt;p&gt;We set up the way resources are shared when someone gets the
resources.  Normally, the deal is we want overall highest use of the
cluster, since after all the university is also contributing
significant sysadmin and electricity resources.  We don’t necessarily
guarantee that you can use it right away, but we try to make it as
close to that as possible.  With some dedicated resources, we have
used preemptible jobs (see below) for the “common” access.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="preemptible-jobs"&gt;
&lt;h2&gt;Preemptible jobs&lt;/h2&gt;
&lt;p&gt;One solution to many of the problems above is to make jobs in partitions
&lt;strong&gt;preemptible&lt;/strong&gt;, which means that if a higher-priority job comes
along, a currently running job can be killed.  It’s killed with a
short grace period to save its state (which it should be designed to
do) so that it can be resumed later.&lt;/p&gt;
&lt;p&gt;Preemptible jobs are great since they allow all the otherwise-unused
resources to be scheduled.  However, managing saved state and scripting
the continuation of jobs at scale can be a big step up in effort.
We want new users to have an easy onboarding path, so
we will always make preemptible jobs opt-in.  We expect that big users
will benefit enough to make adapting to preemptible jobs worthwhile, which also helps
to improve efficiency for everyone else who can’t adapt.&lt;/p&gt;
&lt;p&gt;If you can adapt your work to use preemptible jobs (and you are using
a cluster that has them enabled), then we encourage you to make use of
that option.&lt;/p&gt;
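&lt;p&gt;As a rough illustration of the save-and-resume pattern described
above, here is a minimal Python sketch.  It is not specific to any
cluster: it assumes the scheduler sends &lt;code&gt;SIGTERM&lt;/code&gt; to your
process at the start of the grace period (check your cluster’s
documentation for how preemption is actually signalled).  The file name
&lt;code&gt;checkpoint.json&lt;/code&gt; and all function names are hypothetical,
and the “work” is a placeholder sum.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;
import json
import os
import signal

CHECKPOINT = "checkpoint.json"  # hypothetical checkpoint file name
stop_requested = False


def request_stop(signum, frame):
    """Signal handler: only record the request, the main loop reacts to it."""
    global stop_requested
    stop_requested = True


def install_handler():
    """Assumption: the scheduler announces preemption with SIGTERM."""
    signal.signal(signal.SIGTERM, request_stop)


def load_state(path=CHECKPOINT):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def save_state(state, path=CHECKPOINT):
    """Write to a temporary file first, then rename it over the checkpoint,
    so that being killed mid-write cannot corrupt the previous checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(n_steps, path=CHECKPOINT):
    """Resume from the checkpoint and run up to n_steps.

    Returns True when all steps are done, False when interrupted."""
    state = load_state(path)
    for step in range(state["step"], n_steps):
        if stop_requested:
            save_state(state, path)
            return False  # interrupted: resubmit the job to continue
        state["total"] += step  # placeholder for one unit of real work
        state["step"] = step + 1
    save_state(state, path)
    return True
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;In use, a job script would call &lt;code&gt;install_handler()&lt;/code&gt; once
at startup and then &lt;code&gt;run(n_steps)&lt;/code&gt;.  If &lt;code&gt;run&lt;/code&gt;
returns &lt;code&gt;False&lt;/code&gt;, the job was interrupted, and resubmitting the
same script continues from the saved step.  Each unit of work has to be
short compared to the grace period, since the stop request is only
checked between units.&lt;/p&gt;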
&lt;/section&gt;
&lt;section id="summary-of-things-that-can-be-tuned-per-partition"&gt;
&lt;h2&gt;Summary of things that can be tuned per-partition&lt;/h2&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Partition layout and overlaps&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Maximum runtime&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Maximum job size&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;GrpTRESRunMins&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Preemptibility&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Most importantly, while it may be possible to make some theoretically
perfect arrangement for maximum use and minimum unexpected waiting,
that can make the cluster usage much harder to explain.  So
we try to find a balance between those goals and overall usability.
In the end, it becomes a trilemma: maximum resource usage,
resources always standing by for you, and usability.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2026/dilemma-of-partition-config/"/>
    <summary>Sometimes people come to us and complain that there are idle cluster
GPUs, and they could be used if there wasn’t a per-user limit on max
GPUs that any one user could use at once.  Other people come to us and
complain that all GPUs are in use by various people, and they can’t
start jobs quickly.  Perhaps you see the dilemma.  People both expect there are usually
available GPUs for them, and also that GPUs can be used to the fullest
extent.  We’ll use GPUs as an example here, but this isn’t specific to
GPUs.</summary>
    <published>2026-04-16T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2025/ucl-rse-visit/</id>
    <title>RSE report from visiting University College London Advanced Research Computing</title>
    <updated>2025-05-08T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="rse-report-from-visiting-university-college-london-advanced-research-computing"&gt;

&lt;p&gt;Richard Darst visited UCL RSE to see what we can learn from them, and
learned the following from speaking to Jonathan Cooper, the Director
of Collaborations at ARC.  Thanks to Jonathan for checking and
improving this post.&lt;/p&gt;
&lt;section id="about-rse-at-ucl"&gt;
&lt;h2&gt;About RSE at UCL&lt;/h2&gt;
&lt;p&gt;The RSE team at UCL (“collaborations and consultancy”) is part of the Advanced Research Computing Centre
(&lt;a class="reference external" href="https://www.ucl.ac.uk/advanced-research-computing/"&gt;ARC&lt;/a&gt;), which is a
department-level organization within the university.  Like Science-IT
at Aalto, it is not part of their IT Services, but also not part of an
academic department/faculty: it’s sort of their own thing.  Overall,
they have around 120 people, of which about 60 are in the
“Collaborations Team”, which is basically the RSE side of things
(broader, as they have more professions, but equivalent in type of
activity to the RSE side of Science-IT).  They are professional (not
academic career track) staff, but ARC does have some academic
affiliation which allows them to take part in academic teaching when
it’s needed.&lt;/p&gt;
&lt;p&gt;The actual RSE work is part of “&lt;a class="reference external" href="https://www.ucl.ac.uk/advanced-research-computing/collaborations-and-consultancy"&gt;Collaborations and Consultancy&lt;/a&gt;”
and consists of job titles such as Data Scientists, Data Stewards,
Research Software Engineers, Digital Research Managers, and Research
Infrastructure Developers.  It is not that different from Aalto RSE
work, but their scale means they have more emphasis on large projects.
Many of their staff are assigned to various projects for months at a
time (usually more than one project, but not too many - actual
day-to-day time allocation is decided based on what works for each
project).  Many projects have a more senior and a more junior staff member
working together, or at least aware of the work.&lt;/p&gt;
&lt;p&gt;Other sides of ARC are platforms &amp;amp; services (HPC and many more
research-focused IT services) and teaching/training.  Overall, ARC and
ARC Collaborations are quite similar to Science-IT and Aalto RSE, but
at a much larger and more industrial scale, capable of helping very
many more people.&lt;/p&gt;
&lt;p&gt;ARC’s RSE team has grown over the years.  It started in 2012 with 3
posts, and by around 2016-2017 it was 8 people and began growing very
rapidly.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="rses-within-research-groups"&gt;
&lt;h2&gt;RSEs within research groups&lt;/h2&gt;
&lt;p&gt;I was especially interested in the work of RSEs within research groups
(this was the original purpose of the meeting).  There are several
main strategies they use.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;A project provides funding and ARC staff member(s) are deployed to
work on the project, scheduling as it makes sense. This is the most
common approach, for large and small projects. All ARC staff
are on permanent contracts because when any given project’s funding
ends, they know there will be other projects available.  This is
what we have done with some flagship projects.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;“Professional opportunities” - this is a special middle ground.
Someone is hired as an RSE to a research group, with ARC’s help in
recruitment.  They are paid and work within the group (with a job
title such as postdoc). ARC supports their work and makes sure it
goes well.  After the funding is over, the person is pre-approved to
join ARC permanently if they wish (assuming things have gone well,
which thanks to ARC’s help, they always do).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;“Associate ARC membership” - people outside of ARC can become
associated and be part of chat channels, meetings, internal
trainings, etc.  This allows others to improve their research
engineering skills even without being employed by ARC.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Like we have found, they think that full-time permanent employment is
best for professional RSE work.  Short-term contracts like postdocs
result in mixed motivations and constant turnover.  They have enough scale
that they know people are always being hired, so they can
essentially promise postdocs positions when they are done.  This
attracts higher-quality candidates and lets them focus on their RSE
work, rather than thinking about what (academic) job might come next.  They
currently (in 2025) have enough history to have approval to hire 6 new
staff each year (even without knowing exactly what projects they will
work on).&lt;/p&gt;
&lt;p&gt;Just like at Aalto, one can also ask “would these staff be better
hired within research projects in academic departments?”  And just
like at Aalto, the answer is “yes, it could make sense - but these
departments aren’t currently set up to teach, mentor, supervise,
promote, and keep RSE skills.”  Thus it’s a separate team, just like
at Aalto.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="lessons-for-aalto"&gt;
&lt;h2&gt;Lessons for Aalto&lt;/h2&gt;
&lt;p&gt;Overall, my evaluation is that if ARC is doing a lot of things right,
Aalto Science-IT is doing it right too.  There are implementation
differences as you might expect from the different national and
university environment.  My thoughts for the future are:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;ARC RSE began growing very rapidly after about 5 years of existence
(6-8 staff/year).  Aalto RSE is at that point now.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Hiring RSEs to research groups is possible, but is different from
hiring to a central team.  It requires close cooperation and thought
about their future career paths; otherwise the effect is only a bit
different from a postdoc.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Research computing is different from university IT Services.  I like
the way UCL research computing is a separate department, but
practically speaking I don’t think we can or should go towards that
now.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2025/ucl-rse-visit/"/>
    <summary>Richard Darst visited UCL RSE to see what we can learn from them, and
learned the following from speaking to Jonathan Cooper, the Director
of Collaborations at ARC.  Thanks to Jonathan for checking and
improving this post.</summary>
    <published>2025-05-08T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2024/what-is-a-rse/</id>
    <title>What is a Research Software Engineer?</title>
    <updated>2024-11-20T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="what-is-a-research-software-engineer"&gt;

&lt;p&gt;What is a Research Software Engineer (RSE)?  It’s too many things to
define here, and you can find definitions elsewhere.  Maybe the question
you would really like answered is &lt;strong&gt;How do I get value from Research Engineers
like Aalto Scientific Computing does?&lt;/strong&gt; - that’s what we’ll try to
answer here.  Through that, we may arrive at a functional definition.&lt;/p&gt;
&lt;p&gt;This page is written from the perspective of &lt;em&gt;computational science&lt;/em&gt; -
similar messages may apply to other fields.  Note that computing and
AI are in every field now.&lt;/p&gt;
&lt;section id="university-roles"&gt;
&lt;h2&gt;University roles&lt;/h2&gt;
&lt;p&gt;When someone wants Research Engineers, it’s probably because they see
something that is missing in the current academic system.  Thus, to
understand what we want, we need to understand the system.  Below is
rkdarst’s current mental model:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Academics&lt;/strong&gt; are who we usually consider researchers.  They do
research, and are promoted based on articles published and citations
received from other academics.  Citations from academics tend to
focus on innovation and novelty, so that’s what decides career
paths.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Research Engineers (REs)&lt;/strong&gt; focus on the practice and “structural
integrity” of the research: the tools, the reproducibility, and
more.  They are more concerned with the work being done well than with
pure novelty and citations. &lt;a class="footnote-reference brackets" href="#id2" id="id1" role="doc-noteref"&gt;&lt;span class="fn-bracket"&gt;[&lt;/span&gt;0&lt;span class="fn-bracket"&gt;]&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Research Software Engineers (RSEs) are a subset of Research
Engineers, and I feel that the “software” is the least significant
part there.  Software is important, but so is data, computing,
reproducibility, etc.&lt;/p&gt;
&lt;p&gt;Particular examples of things that Research Engineers are good at
include: reproducibility, maintaining software and data across academic
generations, Open Science, programming, using large computer
clusters, data security, and research ethics processes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Researchers&lt;/strong&gt;, in my mind, cover both of the above (and more).
Industrial research teams would have both of the above and possibly
even more different roles all working together on their problems.
In universities, we tend to only consider the academics to be
researchers.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So what’s a Research Engineer?  To me, it’s defined mostly in terms of
what is missing from the typical academic career path (of
undergraduate → junior researcher → senior researcher).  At all
levels, I’ve seen research engineering under-valued and under-taught
(not necessarily because it’s not wanted, but because it’s not novel
science and there’s no time).  Senior researchers (group leaders)
often see the value, but don’t have the time (or sometimes ability) to
train and supervise research engineers well.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="how-aalto-rse-filled-the-gap"&gt;
&lt;h2&gt;How Aalto RSE filled the gap&lt;/h2&gt;
&lt;p&gt;Years before Aalto RSE started (~2017-2018), I saw a need for more
basic skills (for example: version control to manage code) and worked
to promote them in undergraduate programs.  This basically didn’t
work, because they were seen as not scientific and thus not something to
be taught in academic courses (and if they were taught, the courses
would be full of people looking for easy ECTS).  While there certainly
are study programs in software and software engineering, these are
their own thing, and not part of data science or other fields that
need computation.  Software engineering programs also aren’t adapted
to the unpredictability of research.&lt;/p&gt;
&lt;p&gt;This is what prompted us to start Aalto Research Software Engineers - if
we can’t teach people skills in study programs, we have to support
them when they become researchers (and teach it via practical
mentoring).  This has worked out very well, as you can see by our
rapid expansion and heavy usage.&lt;/p&gt;
&lt;p&gt;Aalto RSE is essentially the collaborator our research groups need to
do their top-level work.  This system works very well, but are there
other options?&lt;/p&gt;
&lt;/section&gt;
&lt;section id="how-to-get-research-engineering-competence-in-universities"&gt;
&lt;h2&gt;How to get research engineering competence in universities?&lt;/h2&gt;
&lt;p&gt;The above leads to various ideas.  Take your pick of which angle
you want to approach the problem from:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Better RE teaching in undergraduate programs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;As part of existing programs (is there time to teach this?  Is
there desire?  On the other hand, RE skills are great for
employment prospects)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;As dedicated majors? (Some people are trying to make dedicated RSE
study programs at different universities, and there is a value
there.  But if you ask me the best value is learning RE along with
academic research in a different field)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Better RE teaching in graduate programs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Many of the same things as above apply here, mainly the lack of
time, and the necessity to spend time on novel research, not
learning existing best practices.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Nurture REs within existing research groups:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Nothing stops group leaders from hiring students and postdocs who
have chosen to focus on research engineering.  This often happens
when supervisors hire technical postdocs to manage the RE side of
things.  (The question is: can they be supervised and mentored
well by academic supervisors, if they need to be home-grown?)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If group leaders hire good candidates, Aalto RSE can help mentor
them.  See the companion blog post &lt;a class="reference internal" href="../2024/rse-work-rotations/"&gt;&lt;span class="doc"&gt;RSE work rotations&lt;/span&gt;&lt;/a&gt;
for one idea.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Recruit REs as professor-level group leaders, similar to how senior
academics are recruited:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;These people would focus on collaborating with others to make
projects possible.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The university systems don’t seem set up to value these people,
thus they don’t appear among the ranks.  They could appear if they
spent their careers chasing academic citations, but then would
they be able to spend enough time on research engineering?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I think this is what some people mean when they say they want an
RSE career path: a way to recruit senior academics who lead
research engineering groups.  I think the idea is good but it’s
not how universities are set up, so it’s a long way off.  The
value systems may not even match up.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Create parallel structures that support research engineering:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;That is what Aalto RSE has done.  We are researchers, but we make
new research possible by collaborating with academics, instead of
trying to publish by ourselves.  We are part of the services of
the School of Science.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We also take it upon ourselves to do teaching and mentoring via
co-working for all types of researchers (aspiring academic or
research engineer).  We can fill in the technical mentoring that
many supervisors can’t provide.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can read about &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/history/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;our history&lt;/span&gt;&lt;/a&gt; in
more detail.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="getting-your-own-research-engineers"&gt;
&lt;h2&gt;Getting your own research engineers&lt;/h2&gt;
&lt;p&gt;I’ve seen many people interested in gaining research engineering
competence for their organization.  You need to develop an environment
where research engineers can fill the gaps you have.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Junior academics: encourage them to explore their technical skills.
Show that there is value in this, even if it reduces the number of
publications.  Encourage them to get training (for example the Aalto
RSE training).  Give them time, encouragement, and career prospects
to reach beyond the focus on papers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Other support staff at universities and other organizations: don’t
view them as limited-purpose supporters of an {infrastructure,
service, process}.  View them as supporters of research: let them
holistically support research projects from many angles at once,
rather than only in narrow silos with strict project reporting
requirements.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can hire dedicated staff to be REs, but it’s important that they
are integrated into the local research environment.  Most of our
hires have been local staff who have grown into a new role, and I
think this is how it should be.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of the above, especially the first two, require that time is made
available for RE work, along with a clear vision and network.  Aalto RSE (with
the help of others in Finland) is planning to create a networking and
onboarding program for new research engineers who wish to adopt this
vision.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="summary"&gt;
&lt;h2&gt;Summary&lt;/h2&gt;
&lt;p&gt;If you read this far, you probably see the value in research engineers
and want them yourself.  Just hiring someone, or changing someone’s
job to “RSE”, won’t magically solve your problem.  It takes a
whole mindset shift towards a multi-disciplinary research team.&lt;/p&gt;
&lt;p&gt;What’s the right level of research engineers, permanent and
experienced or junior and learning?  Probably a bit of both.&lt;/p&gt;
&lt;p class="rubric"&gt;Footnotes&lt;/p&gt;
&lt;aside class="footnote-list brackets"&gt;
&lt;aside class="footnote brackets" id="id2" role="doc-footnote"&gt;
&lt;span class="label"&gt;&lt;span class="fn-bracket"&gt;[&lt;/span&gt;&lt;a role="doc-backlink" href="#id1"&gt;0&lt;/a&gt;&lt;span class="fn-bracket"&gt;]&lt;/span&gt;&lt;/span&gt;
&lt;p&gt;I know that “Research Engineer” is a job title that can have
other definitions.&lt;/p&gt;
&lt;/aside&gt;
&lt;/aside&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2024/what-is-a-rse/"/>
    <summary>What is a Research Software Engineer (RSE)?  Too many things to
define, you can find these definitions elsewhere.  Maybe the question
you would like to know is How do I get value from Research Engineers
like Aalto Scientific Computing does? - that’s what we’ll try to
answer here.  Through that, we may learn a functional definition.</summary>
    <published>2024-11-20T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2024/how-busy-is-the-cluster/</id>
    <title>How busy is the cluster?  A discussion</title>
    <updated>2024-10-21T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="how-busy-is-the-cluster-a-discussion"&gt;

&lt;p&gt;We occasionally get some questions: how busy is the cluster?  How
long do I have to wait?  Is there some dashboard that can tell me?&lt;/p&gt;
&lt;p&gt;The answer is, unfortunately, not so easy.  &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;Our cluster&lt;/span&gt;&lt;/a&gt; uses dynamic scheduling with a fairshare algorithm.
All users have a fairshare priority, which decreases the more you have
recently run.  Jobs are ranked by priority (including fairshare plus
other factors), and scheduled in that order.  If there are
unschedulable holes between those jobs, the scheduler can take a job with a lower
priority and fill them in (“backfilling”).  So that gives us:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;A small-enough job with a low priority might still be scheduled
soon.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A higher priority user could submit something while you are waiting,
and increase your wait time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An existing job could end early, making other wait times shorter.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An existing job could end early, allowing some other higher priority
jobs to run sooner, making backfilled jobs run later.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In short: there is no way to give an estimate of the wait time, in the
way people want.  We’ve tried but haven’t found a way to answer the
question well.&lt;/p&gt;
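&lt;p&gt;The interaction between priority order and backfilling can be sketched as a toy model.  This is not how Slurm is implemented: it assumes a single pool of CPUs, a single scheduling pass, and that a lower-priority job may only start if it finishes before more CPUs are released.  The names &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Job&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;fits&lt;/span&gt;&lt;/code&gt; and &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;schedule_pass&lt;/span&gt;&lt;/code&gt; and all numbers are made up for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int   # higher runs first
    cpus: int       # CPUs requested
    minutes: int    # requested time limit


def fits(need, available):
    """True when `need` does not exceed `available`."""
    return min(need, available) == need


def schedule_pass(pending, free_cpus, minutes_until_next_release):
    """Return the names of jobs started in one toy scheduling pass.

    Jobs start in priority order until one does not fit.  After that,
    lower-priority jobs may start only if they fit in the idle CPUs and
    finish before more CPUs are released (a crude form of backfilling).
    """
    started = []
    head_blocked = False
    for job in sorted(pending, key=lambda j: j.priority, reverse=True):
        if not fits(job.cpus, free_cpus):
            head_blocked = True   # a higher-priority job is now waiting
            continue
        if head_blocked and not fits(job.minutes, minutes_until_next_release):
            continue              # would delay the waiting higher-priority job
        started.append(job.name)
        free_cpus -= job.cpus
    return started


# Hypothetical queue: only the small, short job can start right now.
queue = [
    Job("big", priority=900, cpus=16, minutes=600),
    Job("medium", priority=500, cpus=4, minutes=240),
    Job("small", priority=100, cpus=2, minutes=30),
]
print(schedule_pass(queue, free_cpus=4, minutes_until_next_release=60))
```

&lt;p&gt;With these made-up numbers the lowest-priority job is the only one that starts.  Change what is in the queue, or when CPUs are released, and the answer changes, which is why a reliable wait-time estimate is so hard to give.&lt;/p&gt;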
&lt;p&gt;What can we know?&lt;/p&gt;
&lt;section id="priority-comparison"&gt;
&lt;h2&gt;Priority comparison&lt;/h2&gt;
&lt;p&gt;You can compare your fairshare factor with other users.  If you run
&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;sshare&lt;/span&gt;&lt;/code&gt; you can see the fairshare (higher means higher priority).
&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;sprio&lt;/span&gt;&lt;/code&gt; shows the relative priority of all jobs (here, the raw values
are multiplied by some factor and added).  On Triton (the new install
since May 2024), the values mean the following:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;The “age” value is “1e4 × (time_in_queue/7day)”, capped at 7 days:
zero when first submitted, increasing to 10000 at 7 days old.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The fairshare factor is “1e7 × FairShare priority from sshare”&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The FairShare value is computed based on the raw usage value: at
each level of the share tree, it divides it up among the users so
that those who have run less have a higher priority.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The usage value decays with a two-week half life.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The others are mostly constant.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
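&lt;p&gt;To get a feel for the scale of these two terms, here is a small sketch.  The weights 1e4 and 1e7 and the 7-day cap come from the list above, and the age term is taken as the linear ramp described there in words (zero at submission, 10000 at 7 days).  The function names are hypothetical, and the “mostly constant” other terms are lumped into one placeholder argument.&lt;/p&gt;

```python
AGE_WEIGHT = 1e4         # age contributes at most 10000
FAIRSHARE_WEIGHT = 1e7   # multiplies the FairShare value from sshare
MAX_AGE_DAYS = 7         # age stops growing after 7 days in the queue


def age_term(days_in_queue):
    """Age contribution: 0 at submission, rising linearly to 1e4 at 7 days."""
    return AGE_WEIGHT * min(days_in_queue / MAX_AGE_DAYS, 1.0)


def fairshare_term(fairshare):
    """Fairshare contribution for a FairShare value between 0 and 1."""
    return FAIRSHARE_WEIGHT * fairshare


def approx_priority(days_in_queue, fairshare, other_terms=0.0):
    """Sum of the two variable terms plus the mostly constant rest."""
    return age_term(days_in_queue) + fairshare_term(fairshare) + other_terms


# FairShare difference that is worth as much as a full week of waiting.
print(AGE_WEIGHT / FAIRSHARE_WEIGHT)
```

&lt;p&gt;Under these assumptions, a FairShare difference of 0.001 outweighs a full week of queue age, so fairshare dominates the ordering.&lt;/p&gt;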
&lt;p&gt;Still: this is all very abstract and what others submit has more
effect than your priority.  The only thing you can control is using
less resources.&lt;/p&gt;
&lt;p&gt;This is quite cluster dependent, so we’d recommend asking your cluster’s
staff how your own cluster is set up.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="how-to-get-jobs-scheduled-sooner"&gt;
&lt;h2&gt;How to get jobs scheduled sooner&lt;/h2&gt;
&lt;p&gt;This may be your real question. There are two main things:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Use less resources.  &lt;strong&gt;Make sure you don’t request more than
you need (CPU, memory, GPUs)&lt;/strong&gt; - this way your jobs reduce your future
fairshare less.  Of course, use everything you do need: “saving for
later” doesn’t give you more resources later than you give up now.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Request less resources per job.  This will let you be backfilled
into the scheduling holes (see below).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="when-the-cluster-is-mostly-empty"&gt;
&lt;h2&gt;When the cluster is mostly empty&lt;/h2&gt;
&lt;p&gt;In this case, if there is a slot for you, you are scheduled very soon.
&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;srun&lt;/span&gt; &lt;span class="pre"&gt;--test-only&lt;/span&gt; &lt;span class="pre"&gt;[RESOURCE_REQUESTS]&lt;/span&gt;&lt;/code&gt; might give you some hint about
when a job would be scheduled - it basically tries to schedule an
empty job and reports the currently estimated start time.  (It uses up a
JobID though, so don’t run it in a loop.)&lt;/p&gt;
&lt;/section&gt;
&lt;section id="the-cluster-has-a-long-queue"&gt;
&lt;h2&gt;The cluster has a long queue&lt;/h2&gt;
&lt;p&gt;In this case, nothing can be said since the queue is always being
re-shuffled.  In the long-run, you get a fair share of resources.  If
you haven’t used much lately, you have more now.  Your wait time
depends more on what other users submit (and their priorities) than
what you submit - and this is always changing.  You can tell something
about how soon you’d be scheduled by looking at your priority relative
to other users.  Make your jobs as small and efficient as possible to
fit in between the holes of other jobs and get scheduled as soon as
possible.  If you can break one big job into smaller pieces (less
time, less CPU, less memory) that depend on each other, then you can
better fit in between all of the big jobs.  See the &lt;a class="reference external" href="https://coderefinery.github.io/TTT4HPC_resource_management/scheduling/"&gt;Tetris metaphor
in TTT4HPC&lt;/a&gt;.&lt;/p&gt;
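&lt;p&gt;One way to break a big job into smaller pieces that depend on each other is a chain of short jobs, each submitted with &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;sbatch&lt;/span&gt; &lt;span class="pre"&gt;--dependency=afterok:JOBID&lt;/span&gt;&lt;/code&gt;.  The sketch below is only an illustration: &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;step.sh&lt;/span&gt;&lt;/code&gt; is a hypothetical batch script that takes the piece number as its argument and saves its state for the next piece, and the helper names are made up.  The demo at the end prints the commands instead of submitting anything.&lt;/p&gt;

```python
import itertools
import subprocess


def sbatch(args):
    """Submit with sbatch and return the job ID (needs a Slurm cluster)."""
    out = subprocess.run(["sbatch", "--parsable", *args],
                         check=True, capture_output=True, text=True).stdout
    return out.strip().split(";")[0]   # --parsable prints "jobid" or "jobid;cluster"


def submit_chain(script, n_pieces, time_limit, submit=sbatch):
    """Submit n_pieces short jobs; each starts only after the previous one succeeded."""
    job_ids = []
    for piece in range(n_pieces):
        args = [f"--time={time_limit}"]
        if job_ids:
            args.append(f"--dependency=afterok:{job_ids[-1]}")
        args += [script, str(piece)]
        job_ids.append(submit(args))
    return job_ids


_fake_ids = itertools.count(1001)


def dry_run(args):
    """Print the command instead of submitting it; return a made-up job ID."""
    print("sbatch --parsable", " ".join(args))
    return str(next(_fake_ids))


# Three one-hour pieces instead of one three-hour job.
submit_chain("step.sh", n_pieces=3, time_limit="01:00:00", submit=dry_run)
```

&lt;p&gt;Each piece asks for less time, so it has a better chance of fitting into a scheduling hole.  The trade-off is that the work has to be able to stop and resume between pieces.&lt;/p&gt;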
&lt;p&gt;If your need is “run stuff quickly for testing”, make sure the jobs
are as short as possible.  Ask your cluster staff about
development or debugging partitions that may be of use, because that’s
the solution for quick tests.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="long-older-description"&gt;
&lt;h2&gt;Long, older description&lt;/h2&gt;
&lt;p&gt;This description was in an old version of our docs but has since been
removed.  The exact values are out of date.  It’s included here for
detailed reference anyway:&lt;/p&gt;
&lt;p&gt;Triton queues are not first-in first-out, but “fairshare”.  This means
that every person has a priority.  The more you run the lower your
user priority.  As time passes, your user priority increases again.
The longer a job waits in the queue, the higher its job priority goes.
So, in the long run (if everyone is submitting a never-ending stream
of jobs), everyone will get exactly their share.&lt;/p&gt;
&lt;p&gt;Once there are priorities, then: jobs are scheduled in order of
priority, then any gaps are backfilled with any smaller jobs that can
fit in.  So small jobs usually get scheduled fast regardless.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Warning: from this point on, we get more and more technical, if you
really want to know the details.  Summary at the end.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;What’s a share?  Currently shares are based on department and their
respective funding of Triton (&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;sshare&lt;/span&gt;&lt;/code&gt;).  It used to be that
departments had a share, and then each member had a share of that
department.  But for complex reasons we have changed it so that it’s
flat: each person has a share, and the shares of everyone in a
department add up to that department’s share.  When you are below
your share (relative to everyone else), you have higher priority, and
vice versa.&lt;/p&gt;
&lt;p&gt;Your priority goes down via the “job billing”: roughly time×power.
CPUs are billed at 1/s (but older, less powerful CPUs cost less!).
Memory costs .2/GB/s.  But: you only get billed for the max of memory
or CPU. So if you use one CPU and all the memory (so that no one else
can run on it), you get billed for all memory but no CPU.  Same for
all CPUs and little memory.  This encourages balanced use.  (This also
applies to GPUs.)&lt;/p&gt;
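&lt;p&gt;The “max of memory or CPU” rule can be written out as a small sketch.  The weights are the ones quoted above, which the text itself notes are out of date; the function name and the example node (40 CPUs, 200 GB) are hypothetical, and GPUs and the lower weights of older CPUs are left out.&lt;/p&gt;

```python
CPU_WEIGHT = 1.0   # billing units per CPU per second
MEM_WEIGHT = 0.2   # billing units per GB of memory per second


def job_billing(cpus, mem_gb, seconds):
    """Bill the larger of the CPU and memory charges, not their sum."""
    return max(cpus * CPU_WEIGHT, mem_gb * MEM_WEIGHT) * seconds


# Hypothetical one-hour jobs on a node with 40 CPUs and 200 GB of memory.
print(job_billing(cpus=1, mem_gb=200, seconds=3600))    # memory dominates
print(job_billing(cpus=40, mem_gb=4, seconds=3600))     # CPU dominates
print(job_billing(cpus=20, mem_gb=100, seconds=3600))   # balanced half node
```

&lt;p&gt;With these weights, taking all the memory with one CPU costs about the same as taking all the CPUs, and the balanced half-node job costs about half as much.&lt;/p&gt;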
&lt;p&gt;GPUs also have a billing weight, currently tens of times higher than a
CPU billing weight for the newest GPUs.  (In general all of these can
change; for the latest info, search for &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;BillingWeights&lt;/span&gt;&lt;/code&gt; in
&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;/etc/slurm/slurm.conf&lt;/span&gt;&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;If you submit a long job but it ends early, you are only billed for
the actual time you use (but the longer job might take longer to start
at the beginning).  Memory is always billed for the full reservation
even if you use less, since it isn’t shared.&lt;/p&gt;
&lt;p&gt;The “user priority” is actually just a record of how much you have
consumed lately (the billing numbers above).  This number goes down
with a half-life decay of 2 weeks.  Your personal priority is your share
compared to that, so we get the effect described above: the more you
(or your department) have run lately, the lower your priority.&lt;/p&gt;
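&lt;p&gt;The two-week half-life can be made concrete with a short sketch.  It assumes plain exponential decay applied to each past job’s billing; the function names and the usage numbers are hypothetical.&lt;/p&gt;

```python
HALF_LIFE_DAYS = 14   # two-week half-life


def decayed_usage(usage, days_ago):
    """What remains of past usage after `days_ago` days of exponential decay."""
    return usage * 0.5 ** (days_ago / HALF_LIFE_DAYS)


def effective_usage(history):
    """Sum the decayed usage over (usage, days_ago) pairs."""
    return sum(decayed_usage(usage, days_ago) for usage, days_ago in history)


# The same amount of billing counts less the longer ago it was consumed.
for days in (0, 14, 28, 56):
    print(days, decayed_usage(1000, days))
```

&lt;p&gt;After two weeks a job counts half as much against you, after four weeks a quarter, and so on: a burst of heavy use lowers your priority for a while but is never held against you permanently.&lt;/p&gt;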
&lt;p&gt;If you want your stuff to run faster, the best way is to specify your
time more accurately (so that the job may find a place
sooner) and your memory too (which avoids needlessly wasting your priority).&lt;/p&gt;
&lt;p&gt;While your job is pending in the queue, Slurm checks those metrics
regularly and recalculates the job priority.  If you are
interested in details, take a look at the &lt;a class="reference external" href="https://slurm.schedmd.com/priority_multifactor.html"&gt;multifactor priority plugin&lt;/a&gt; page (general
info) and &lt;a class="reference external" href="https://slurm.schedmd.com/priority_multifactor3.html"&gt;depth-oblivious fair-share factor&lt;/a&gt; for what we
use specifically (warning: a very in-depth page).  On Triton, you can
always see the latest billing weights in &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;/etc/slurm/slurm.conf&lt;/span&gt;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Numerically, job priorities range from 0 to 2^32-1.  Higher is
sooner to run, but really the number doesn’t mean much itself.&lt;/p&gt;
&lt;p&gt;These commands can show you information about your user and job
priorities:&lt;/p&gt;
&lt;table class="docutils align-default"&gt;
&lt;tbody&gt;
&lt;tr class="row-odd"&gt;&lt;td&gt;&lt;p&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;slurm&lt;/span&gt; &lt;span class="pre"&gt;s&lt;/span&gt;&lt;/code&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td&gt;&lt;p&gt;list of jobs per user with their current priorities&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-even"&gt;&lt;td&gt;&lt;p&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;slurm&lt;/span&gt; &lt;span class="pre"&gt;full&lt;/span&gt;&lt;/code&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td&gt;&lt;p&gt;as above but almost all of the job parameters are listed&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-odd"&gt;&lt;td&gt;&lt;p&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;slurm&lt;/span&gt; &lt;span class="pre"&gt;shares&lt;/span&gt;&lt;/code&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td&gt;&lt;p&gt;displays usage (RawUsage) and current FairShare weights (FairShare, higher is better) values for all users&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-even"&gt;&lt;td&gt;&lt;p&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;sshare&lt;/span&gt;&lt;/code&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td&gt;&lt;p&gt;Raw data of the above&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-odd"&gt;&lt;td&gt;&lt;p&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;sprio&lt;/span&gt;&lt;/code&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td&gt;&lt;p&gt;Raw priority of queued jobs&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-even"&gt;&lt;td&gt;&lt;p&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;slurm&lt;/span&gt; &lt;span class="pre"&gt;j&lt;/span&gt; &lt;span class="pre"&gt;&amp;lt;jobid&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td&gt;&lt;p&gt;shows &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;&amp;lt;jobid&amp;gt;&lt;/span&gt;&lt;/code&gt; detailed info including priority, requested nodes etc.&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/section&gt;
&lt;section id="summary"&gt;
&lt;h2&gt;Summary&lt;/h2&gt;
&lt;p&gt;tl;dr: Just select the resources you think you need, and Slurm
tries to balance things out so everyone gets their share.  The best
way to maintain high priority is to use resources efficiently so you
don’t need to over-request.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2024/how-busy-is-the-cluster/"/>
    <summary>We occasionally get some questions: how busy is the cluster?  How
long do I have to wait?  Is there some dashboard that can tell me?</summary>
    <published>2024-10-21T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2024/rse-work-rotations/</id>
    <title>RSE work rotations</title>
    <updated>2024-10-08T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="rse-work-rotations"&gt;

&lt;p&gt;Let’s say you want to start a Research (Software) Engineer team in
your own unit.  How do you set your new hires off on the right
path?  A proposal is outlined below.&lt;/p&gt;
&lt;p&gt;This is a companion post to &lt;a class="reference internal" href="../2024/rse-collaboration/"&gt;&lt;span class="doc"&gt;Future RSE collaboration in Finland&lt;/span&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;section id="getting-started"&gt;
&lt;h2&gt;Getting started&lt;/h2&gt;
&lt;p&gt;You need to find the right person to hire for the role.  Most likely,
this means someone with the skills you need but the mindset to
transition from their own work to making other work possible.  You can
find hiring resources on &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;the Aalto RSE page&lt;/span&gt;&lt;/a&gt; and some brief thoughts in the companion post
&lt;a class="reference internal" href="../2024/rse-collaboration/"&gt;&lt;span class="doc"&gt;Future RSE collaboration in Finland&lt;/span&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Let’s say you have hired someone.  What’s next?&lt;/p&gt;
&lt;/section&gt;
&lt;section id="an-apprenticeship-proposal"&gt;
&lt;h2&gt;An apprenticeship proposal&lt;/h2&gt;
&lt;p&gt;This proposal is much easier for someone inside of Aalto University
than outside, but possibly could be negotiated for others.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Your new hire works as part of the existing School of Science RSE
team initially, perhaps ~1 year.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The hire is paid, organizationally supervised in, and sits in your
own unit.  It is absolutely critical that they maintain close
connections to your own unit; the membership in our team is only
virtual.  (Our team is remote-first, so this is easy.)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;They focus on projects from your own unit, but as part of our daily
flow.  This could mean asking your audience to join our
&lt;a class="reference external" href="https://scicomp.aalto.fi/help/garage/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;SciComp garage&lt;/span&gt;&lt;/a&gt; for help and requesting that new big
projects come via our project management systems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Your new hire will learn all about how we work.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Your new hire will experience a tremendous diversity of projects and
work with experts on them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;After the initial ~1 year period, we sit down and decide what is
next.  Does your new hire stay working as part of our team (with a
greater focus on your own unit’s projects)?  Or do they split off
and start doing their own thing in your unit?  Or some combination?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This gives you the most important part of our onboarding and training.
There is no better way to develop the right mindset.  If we split
later, your staff will know who to ask about harder problems that
come up.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-s-next"&gt;
&lt;h2&gt;What’s next?&lt;/h2&gt;
&lt;p&gt;If this sounds interesting to you, contact the author of this article
(&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;first.last&amp;#64;aalto.fi&lt;/span&gt;&lt;/code&gt; or various chat systems).&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2024/rse-work-rotations/"/>
    <summary>Let’s say you want to start a Research (Software) Engineer team in
your own unit.  How do you set your new hires off on the right
path?  A proposal is outlined below.  This is a companion post to Future RSE collaboration in Finland.  You need to find the right person to hire for the role.  Most likely,
this means someone with the skills you need but the mindset to
transition from their own work to making other work possible.  You can
find hiring resources on the Aalto RSE page and some brief thoughts in the companion post
Future RSE collaboration in Finland.  Let’s say you have hired someone.  What’s next?  This proposal is much easier for someone inside of Aalto University
than outside, but possibly could be negotiated for others.  Your new hire works as part of the existing School of Science RSE
team initially, perhaps ~1 year.  The hire is paid, organizationally supervised in, and sits in your
own unit.  It is absolutely critical that they maintain close
connections to your own unit; the membership in our team is only
virtual.  (Our team is remote-first, so this is easy.)  They focus on projects from your own unit, but as part of our daily
flow.  This could mean asking your audience to join our
SciComp garage for help and requesting that new big
projects come via our project management systems.  Your new hire will learn all about how we work.  Your new hire will experience a tremendous diversity of projects and
work with experts on them.  After the initial ~1 year period, we sit down and decide what is
next.  Does your new hire stay working as part of our team (with a
greater focus on your own unit’s projects)?  Or do they split off
and start doing their own thing in your unit?  Or some combination?  This gives you the most important part of our onboarding and training.
There is no better way to develop the right mindset.  If we split
later, your staff will know who to ask about harder problems that
come up.  If this sounds interesting to you, contact the author of this article
(first.last@aalto.fi or various chat systems).</summary>
    <published>2024-10-08T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2024/rse-collaboration/</id>
    <title>Future RSE collaboration in Finland</title>
    <updated>2024-10-08T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="future-rse-collaboration-in-finland"&gt;

&lt;p&gt;The Aalto University School of Science has a successful
&lt;a class="reference external" href="https://scicomp.aalto.fi/rse/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;Research Software Engineering service&lt;/span&gt;&lt;/a&gt;
serving the whole university.  This service has proven its value and
there are an increasing number of questions of how others can form
their own teams in Finland and work together.  This post gives some
thoughts on the matter.&lt;/p&gt;
&lt;p&gt;This page is the opinion of the author and not Aalto itself.  It’s not
an open offer for collaboration.  The author is happy to help with any
questions you may have (&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;first.last&amp;#64;aalto.fi&lt;/span&gt;&lt;/code&gt; or various chat
systems).&lt;/p&gt;
&lt;section id="what-is-research-software-engineering"&gt;
&lt;h2&gt;What is “research software engineering”?&lt;/h2&gt;
&lt;p&gt;Universities have academics: the traditional core, making ideas and
new results.  Much research, even outside “computer science”, needs
computational tools.  However, the skills needed even for basic
computation can be so complex that not all academics can master them to
do cutting-edge research.  A Research (Software) Engineer (RSE) can
bridge that gap: academics focus on their primary work, and the RSE
makes the computing seamless.&lt;/p&gt;
&lt;p&gt;For more info, see the &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;Aalto RSE site&lt;/span&gt;&lt;/a&gt;.
This is not that different from research engineers supporting complex
physical equipment.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="rkdarst-s-recommendations-for-rse-team-starters"&gt;
&lt;h2&gt;rkdarst’s recommendations for RSE team starters&lt;/h2&gt;
&lt;p&gt;We’ve found there are plenty of qualified people to hire.  The harder
part is mentoring them to transition from a researcher (focused on
single projects with emphasis on own publications) to supporter
(supporting a wide variety of people with respect and compassion).
This transition needs active mentoring.&lt;/p&gt;
&lt;p&gt;See the companion post about
&lt;a class="reference internal" href="../2024/rse-work-rotations/"&gt;&lt;span class="doc"&gt;work rotations for RSE mentoring&lt;/span&gt;&lt;/a&gt; -
if you are in Aalto University then start there.&lt;/p&gt;
&lt;p&gt;You should decide if you want (a) wide-ranging support which may
include helping with basics or (b) specialist support for a
limited audience.  I would argue that our most important impact is
(a): this has gotten us the most benefit overall, and a steady stream
of more advanced projects as work advances.&lt;/p&gt;
&lt;figure class="align-default" id="id1"&gt;
&lt;img alt="Two by two grid with axes generalist/specialist and wide/local audience" src="https://github.com/AaltoSciComp/aaltoscicomp-graphics/blob/master/figures/rse-types.png?raw=true" /&gt;
&lt;figcaption&gt;
&lt;p&gt;&lt;span class="caption-text"&gt;A possible categorization of research engineer roles.  Where do
you want your hire to fit in here?  (This is not designed to
classify people; it’s designed to plan how people might be
located.)&lt;/span&gt;&lt;/p&gt;
&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/section&gt;
&lt;section id="collaboration-possibilities-no-funding"&gt;
&lt;h2&gt;Collaboration possibilities: no funding&lt;/h2&gt;
&lt;p&gt;Let’s say you want your own RSE team at your own organization.  How
can you and Aalto RSE work together?&lt;/p&gt;
&lt;p&gt;Even without joint funding, some of us Aalto people would be happy to
talk and give some advice, and be part of a broader general network.
For example:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;How our team works, what makes it work, advice for your team&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Joint RSE seminars to build skills, for example as part of
&lt;a class="reference external" href="https://scicomp.aalto.fi/tech/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;FCCI Tech (aka the SciComp Tech series)&lt;/span&gt;&lt;/a&gt;, &lt;a class="reference external" href="https://nordic-rse.org/events/seminar-series/"&gt;Nordic-RSE seminar series&lt;/a&gt;, or something new.
Both of these are good for professional development and community.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;All the advice and practices on &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;the Aalto RSE site&lt;/span&gt;&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Professional networking and so on.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, without funding, Aalto needs to focus on its own work.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="collaboration-possibilities-with-funding"&gt;
&lt;h2&gt;Collaboration possibilities: with funding&lt;/h2&gt;
&lt;p&gt;With joint funding, it might be possible to make a collaboration.&lt;/p&gt;
&lt;p&gt;Any higher level collaboration needs to be discussed with management.
Assuming these discussions go well, we might join a collaboration
together so that we can actually share projects between the teams.
There should always be a strong local presence, because
that gets the best value.  This opens more possibilities.&lt;/p&gt;
&lt;p&gt;The more experienced or larger teams could provide:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Mentoring possibilities for new research engineers and their teams
(see &lt;a class="reference internal" href="../2024/rse-work-rotations/"&gt;&lt;span class="doc"&gt;RSE work rotations&lt;/span&gt;&lt;/a&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A base for professional networking.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A larger base of knowledge, for more advice and help with specialist
problems.  A very important part of our team is that for almost any
problem, someone has seen it and can solve it quickly.  Then we
train others to solve it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Joint support sessions such as our &lt;a class="reference external" href="https://scicomp.aalto.fi/help/garage/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;SciComp garage&lt;/span&gt;&lt;/a&gt;,
which allow a wider support base for problems, covering the
previous point.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The newer or smaller teams could provide:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Funding via some joint project.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;More staff around to help fill in the gaps when needed (these staff
also get training in these projects they experience).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Specialty domain knowledge (both for support of academics and for
professional development).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A collaboration with larger funding could have a joint project flow:
there is one place to submit new project requests, and the right
people in any organization will work on them.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="lighter-collaboration"&gt;
&lt;h2&gt;Lighter collaboration&lt;/h2&gt;
&lt;p&gt;We would welcome observers in our support sessions, especially from
other staff at Aalto.  The &lt;a class="reference external" href="https://nordic-rse.org/about/getinvolved/"&gt;Nordic-RSE chat&lt;/a&gt; is also a good way to
ask questions and see what we are up to for those outside Aalto
University.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-s-next"&gt;
&lt;h2&gt;What’s next?&lt;/h2&gt;
&lt;p&gt;We know of various opportunities being considered for national
(Finland) or international RSE collaborations.  The above are some
basic thoughts, but any model would be tailored to the actual funding
and partners.  There is definitely a benefit to starting off
together.&lt;/p&gt;
&lt;p&gt;For more information, contact the author at &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;first.last&amp;#64;aalto.fi&lt;/span&gt;&lt;/code&gt;
and read &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;Research Software Engineers&lt;/span&gt;&lt;/a&gt; for more info.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2024/rse-collaboration/"/>
    <summary>The Aalto University School of Science has a successful
Research Software Engineering service
serving the whole university.  This service has proven its value and
there are an increasing number of questions of how others can form
their own teams in Finland and work together.  This post gives some
thoughts on the matter.  This page is the opinion of the author and not Aalto itself.  It’s not
an open offer for collaboration.  The author is happy to help with any
questions you may have (first.last@aalto.fi or various chat
systems).  Universities have academics: the traditional core, making ideas and
new results.  Much research, even outside “computer science”, needs
computational tools.  However, the skills needed even for basic
computation can be so complex that not all academics can master them to
do cutting-edge research.  A Research (Software) Engineer (RSE) can
bridge that gap: academics focus on their primary work, and the RSE
makes the computing seamless.  For more info, see the Aalto RSE site.
This is not that different from research engineers supporting complex
physical equipment.  We’ve found there are plenty of qualified people to hire.  The harder
part is mentoring them to transition from a researcher (focused on
single projects with emphasis on own publications) to supporter
(supporting a wide variety of people with respect and compassion).
This transition needs active mentoring.  See the companion post about
work rotations for RSE mentoring -
if you are in Aalto University then start there.  You should decide if you want (a) wide-ranging support which may
include helping with basics or (b) specialist support for a
limited audience.  I would argue that our most important impact is
(a): this has gotten us the most benefit overall, and a steady stream
of more advanced projects as work advances.  Let’s say you want your own RSE team at your own organization.  How
can you and Aalto RSE work together?  Even without joint funding, some of us Aalto people would be happy to
talk and give some advice, and be part of a broader general network.
For example:
How our team works, what makes it work, advice for your team.
Joint RSE seminars to build skills, for example as part of
FCCI Tech (aka the SciComp Tech series), Nordic-RSE seminar series, or something new.
Both of these are good for professional development and community.
All the advice and practices on the Aalto RSE site.
Professional networking and so on.
However, without funding, Aalto needs to focus on its own work.  With joint funding, it might be possible to make a collaboration.  Any higher level collaboration needs to be discussed with management.
Assuming these discussions go well, we might join a collaboration
together so that we can actually share projects between the teams.
There should always be a strong local presence, because
that gets the best value.  This opens more possibilities.  The more experienced or larger teams could provide:
Mentoring possibilities for new research engineers and their teams
(see RSE work rotations).
A base for professional networking.
A larger base of knowledge, for more advice and help with specialist
problems.  A very important part of our team is that for almost any
problem, someone has seen it and can solve it quickly.  Then we
train others to solve it.
Joint support sessions such as our SciComp garage,
which allow a wider support base for problems, covering the
previous point.
The newer or smaller teams could provide:
Funding via some joint project.
More staff around to help fill in the gaps when needed (these staff
also get training in these projects they experience).
Specialty domain knowledge (both for support of academics and for
professional development).
A collaboration with larger funding could have a joint project flow:
there is one place to submit new project requests, and the right
people in any organization will work on them.  We would welcome observers in our support sessions, especially from
other staff at Aalto.  The Nordic-RSE chat is also a good way to
ask questions and see what we are up to for those outside Aalto
University.  We know of various opportunities being considered for national
(Finland) or international RSE collaborations.  The above are some
basic thoughts, but any model would be tailored to the actual funding
and partners.  There is definitely a benefit to starting off
together.  For more information, contact the author at first.last@aalto.fi
and read Research Software Engineers for more info.</summary>
    <published>2024-10-08T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2024/triton-v3-is-now-default/</id>
    <title>Triton v3 is now default</title>
    <updated>2024-05-06T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="triton-v3-is-now-default"&gt;

&lt;p&gt;Triton has a major update.  You can read our previous info about this at
&lt;a class="reference internal" href="../2023/preparing-for-new-triton/"&gt;&lt;span class="doc"&gt;Preparing for new Triton&lt;/span&gt;&lt;/a&gt;, and our “what has changed” in
&lt;a class="reference external" href="https://version.aalto.fi/gitlab/AaltoScienceIT/triton/-/issues/1593"&gt;Triton issue #1593&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You might get &lt;a class="reference internal" href="../2024/ssh-host-key-warnings/"&gt;&lt;span class="doc"&gt;SSH host key warnings&lt;/span&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;section id="what-is-triton-v3"&gt;
&lt;h2&gt;What is Triton v3&lt;/h2&gt;
&lt;p&gt;It has the same name, and importantly the same user accounts and data,
but all the software and operating system is changed.  In particular:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;All software modules are different&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Any software which has been compiled will need to be re-compiled.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="why-and-why-now"&gt;
&lt;h2&gt;Why, and why now?&lt;/h2&gt;
&lt;p&gt;Triton’s previous operating system was released in 2014.  Security
support runs out at the end of May 2024, and it &lt;em&gt;has&lt;/em&gt; to be updated.
Stability is good for research, so we try to reduce the number of
changes (compare)&lt;/p&gt;
&lt;p&gt;We realize that a change is very disruptive and painful, especially
since the expectation is that Triton never changes.  But an old
operating system causes problems for users too, and these have
multiplied over the years.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-to-do"&gt;
&lt;h2&gt;What to do&lt;/h2&gt;
&lt;p&gt;Most of the transition for different types of software is described in
&lt;a class="reference external" href="https://version.aalto.fi/gitlab/AaltoScienceIT/triton/-/issues/1593"&gt;Triton issue #1593&lt;/a&gt;.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2024/triton-v3-is-now-default/"/>
    <summary>Triton has a major update.  You can read our previous info about this at
Preparing for new Triton, and our “what has changed” in
Triton issue #1593.  You might get SSH host key warnings.  It has the same name, and importantly the same user accounts and data,
but all the software and operating system is changed.  In particular:
All software modules are different.
Any software which has been compiled will need to be re-compiled.
Triton’s previous operating system was released in 2014.  Security
support runs out at the end of May 2024, and it has to be updated.
Stability is good for research, so we try to reduce the number of
changes (compare).  We realize that a change is very disruptive and painful, especially
since the expectation is that Triton never changes.  But an old
operating system causes problems for users too, and these have
multiplied over the years.  Most of the transition for different types of software is described in
Triton issue #1593.</summary>
    <published>2024-05-06T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2024/ssh-host-key-warnings/</id>
    <title>Triton v3 SSH host key warnings</title>
    <updated>2024-05-06T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="triton-v3-ssh-host-key-warnings"&gt;

&lt;p&gt;When updating Triton, many users will get a message like this (or
similar things if you use other SSH clients like PuTTY):&lt;/p&gt;
&lt;div class="highlight-default notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;section id="what-it-means"&gt;
&lt;h2&gt;What it means&lt;/h2&gt;
&lt;p&gt;SSH (Secure SHell) is made to be secure, and that means it
verifies the server you are connecting &lt;strong&gt;to&lt;/strong&gt; via its &lt;strong&gt;ssh host
key&lt;/strong&gt;.  The representation of this key is the &lt;strong&gt;fingerprint&lt;/strong&gt;, like
&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;SHA256:OqCehC2lbHdl8mYGI/G9vlxTwew3H3KrvxKDkwIQy9Y&lt;/span&gt;&lt;/code&gt;.  This means
that the NSA or someone can’t intercept the connection and get your
password by pretending to be Triton.  This is a good thing.&lt;/p&gt;
&lt;p&gt;OpenSSH (the command line program on Linux, macOS, Windows) saves
these host keys (identified by their fingerprints) in
&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;$HOME/.ssh/known_hosts&lt;/span&gt;&lt;/code&gt;.  Other programs may store the keys
somewhere else.&lt;/p&gt;
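&lt;p&gt;To make the idea of a fingerprint concrete, here is a minimal Python sketch of how an OpenSSH-style SHA256 fingerprint can be derived from a public key line.  It assumes the usual &lt;code&gt;key-type base64-blob comment&lt;/code&gt; layout of a public key line.  The function name &lt;code&gt;sha256_fingerprint&lt;/code&gt; and the example key bytes are made up for illustration; they are not a real Triton host key.&lt;/p&gt;

```python
import base64
import hashlib


def sha256_fingerprint(public_key_line):
    """Return an OpenSSH-style SHA256 fingerprint for one public key line."""
    # Assumed layout: "key-type base64-blob optional-comment"
    fields = public_key_line.split()
    key_blob = base64.b64decode(fields[1])
    digest = hashlib.sha256(key_blob).digest()
    # OpenSSH shows the digest as base64 with the trailing "=" padding removed
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# Hypothetical key bytes for illustration only; this is NOT a Triton host key.
example_blob = base64.b64encode(b"illustrative key bytes").decode("ascii")
example_line = "ssh-ed25519 " + example_blob + " example-host"
print(sha256_fingerprint(example_line))
```

&lt;p&gt;The printed value has the same shape as the fingerprints shown in this post: the prefix &lt;code&gt;SHA256:&lt;/code&gt; followed by 43 base64 characters.  Any change to the key bytes gives a completely different fingerprint, which is why a changed host key triggers the warning.&lt;/p&gt;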
&lt;/section&gt;
&lt;section id="what-you-should-do-when-you-see-the-warning"&gt;
&lt;h2&gt;What you should do when you see the warning&lt;/h2&gt;
&lt;p&gt;The warning looks scary but the first thing to ask is “should the
server I am connecting to have changed?”.  If you have been directed
to this blog post, then probably yes, it has.  You should &lt;em&gt;always&lt;/em&gt;
consider whether the fingerprint should have changed, and if there is no reason for
it to have changed, contact your administrators.  You can usually
verify the keys online, for example
&lt;a class="reference external" href="https://scicomp.aalto.fi/triton/usage/ssh-fingerprints/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;Triton ssh key fingerprints&lt;/span&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you are on command line OpenSSH (Linux), it will propose a command
that will remove the old host key:&lt;/p&gt;
&lt;div class="highlight-console notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;$ &lt;/span&gt;ssh-keygen&lt;span class="w"&gt; &lt;/span&gt;-R&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;triton.aalto.fi&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;For other programs, follow whatever prompts it might give to replace
the host key fingerprint.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-are-the-current-ssh-keys"&gt;
&lt;h2&gt;What are the current SSH keys?&lt;/h2&gt;
&lt;p&gt;When you get a message like “The authenticity of host ‘triton.aalto.fi’ can’t be
established”, verify the SSH key fingerprints that are presented, then
answer “yes” to save them permanently (until they next change; they can
always be updated).  The fingerprints for Triton v3 are:&lt;/p&gt;
&lt;div class="highlight-default notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="mi"&gt;3072&lt;/span&gt; &lt;span class="n"&gt;SHA256&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;u8iICwjmvJ&lt;/span&gt;&lt;span class="o"&gt;/+&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="n"&gt;YGxqqK&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;r7FmrDflcgpoGl5ygtAWw&lt;/span&gt; &lt;span class="n"&gt;login4&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;triton&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aalto&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fi&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RSA&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;256&lt;/span&gt; &lt;span class="n"&gt;SHA256&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;OqCehC2lbHdl8mYGI&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;G9vlxTwew3H3KrvxKDkwIQy9Y&lt;/span&gt; &lt;span class="n"&gt;login4&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;triton&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aalto&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fi&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ECDSA&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;256&lt;/span&gt; &lt;span class="n"&gt;SHA256&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;ibL4dBsdrwRjbJCBWL1J5p&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Sg4PGHWxTG6HF65yPcps&lt;/span&gt; &lt;span class="n"&gt;login4&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;triton&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aalto&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fi&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ED25519&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2024/ssh-host-key-warnings/"/>
    <summary>When updating Triton, many users will get a message like this (or
similar things if you use other SSH clients like PuTTY).  SSH (Secure SHell) is made to be secure, and that means it
verifies the server you are connecting to via its ssh host
key.  The representation of this key is the fingerprint, like
SHA256:OqCehC2lbHdl8mYGI/G9vlxTwew3H3KrvxKDkwIQy9Y.  This means
that the NSA or someone can’t intercept the connection and get your
password by pretending to be Triton.  This is a good thing.  OpenSSH (the command line program on Linux, macOS, Windows) saves
these host keys (identified by their fingerprints) in
$HOME/.ssh/known_hosts.  Other programs may store the keys
somewhere else.  The warning looks scary but the first thing to ask is “should the
server I am connecting to have changed?”.  If you have been directed
to this blog post, then probably yes, it has.  You should always
consider whether the fingerprint should have changed, and if there is no reason for
it to have changed, contact your administrators.  You can usually
verify the keys online, for example
Triton ssh key fingerprints.  If you are on command line OpenSSH (Linux), it will propose a command
that will remove the old host key.  For other programs, follow whatever prompts it might give to replace
the host key fingerprint.  When you get a message like “The authenticity of host ‘triton.aalto.fi’ can’t be
established”, verify the SSH key fingerprints that are presented, then
answer “yes” to save them permanently (until they next change; they can
always be updated).  The fingerprints for Triton v3 are:</summary>
    <published>2024-05-06T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2024/rse-funding/</id>
    <title>Research Software Engineer project funding: what’s been working</title>
    <updated>2024-02-21T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="research-software-engineer-project-funding-what-s-been-working"&gt;

&lt;p&gt;The “Research Software Engineer” service provides technical
collaborators for researchers to complement their scientific
knowledge.  &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;Read about Aalto RSE here&lt;/span&gt;&lt;/a&gt;.
The idea of this service was that it would be available to everyone,
but some projects who made extensive use would fund it themselves.&lt;/p&gt;
&lt;p&gt;If you are a group leader reading this, Aalto RSE can help you release
research code, debug it, make it reusable, rescue old code from former
members and make it usable again, make it run on our cluster or CSC’s
clusters, manage data, prepare data for easy use, and so on.  If it’s
not long (less than a month), our work is free; if it’s more than a
month, the following applies.&lt;/p&gt;
&lt;section id="early-project-funding"&gt;
&lt;h2&gt;Early project funding&lt;/h2&gt;
&lt;p&gt;When we started, we hoped for around 50% project funding.  The idea
was that a lot of the funding for the service would come from the
research projects themselves.  This hasn’t really worked out so well,
because a) we accomplish the vast majority of our projects quickly, in
less than a few weeks, and b) finance would understandably not like to
deal with small transactions for small amounts of time.&lt;/p&gt;
&lt;p&gt;What actually happened is that we have received only a small fraction of the
project funding we had hoped for.  On the other hand, this also
means we have supported a far wider variety of projects than we would
have otherwise.  It also means we are better accomplishing our other
goal: tactical support right where and when it’s needed most, with the
least amount of administrative overhead.  This actually better matches
our mission of helping the researchers who need us most.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="future-funding-prospects"&gt;
&lt;h2&gt;Future funding prospects&lt;/h2&gt;
&lt;p&gt;For any &lt;strong&gt;long projects&lt;/strong&gt; (more than a month or so), we still follow
our original plan: we can receive funding from grants (or basic
funding) to do long-term projects.  This is usually 40-80% of an RSE’s
time, spread out over more than a month (and it can also be bursty:
lots of work at some times, waiting for the next task at other times).
We have done this for projects, and we know we can do it in the
future.&lt;/p&gt;
&lt;p&gt;But there’s another thing that has worked well: &lt;strong&gt;retainer&lt;/strong&gt;-type
funding instead of project-based funding.  You have extra funding that
needs to be used?  You know your group needs support but you can’t
name a single specific project to use all the time?  Hire RSEs on
long-term retainers and we’re there for you as needed.  &lt;strong&gt;You will
always get priority for all the quick questions you have&lt;/strong&gt; (in
&lt;a class="reference external" href="https://scicomp.aalto.fi/help/garage/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;SciComp garage&lt;/span&gt;&lt;/a&gt; or otherwise), and you get the highest
priority for your medium projects, we can attend your other group
meetings, and so on.  As your team wants, we’ll make high-impact
improvements here and there.  This could be (for example) anywhere
from 10-40% time over a long period.&lt;/p&gt;
&lt;p&gt;We &lt;em&gt;have&lt;/em&gt; worked out how to do both one-off projects and retainers
with Finance.  As far as external funders are concerned, our staff
count as researchers, so we can use any funding you might have.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;If you think you have a project or want us on retainer, let us
know:&lt;/strong&gt; &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;Research Software Engineers&lt;/span&gt;&lt;/a&gt; or &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/for-researchers/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;For researchers and research groups&lt;/span&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="why-are-our-projects-so-short"&gt;
&lt;h2&gt;Why are our projects so short?&lt;/h2&gt;
&lt;p&gt;This is a valid question.  Compared to many RSE groups, we seem to be
focusing on many small questions for a broad audience that knows a lot
about the problems they need to solve.  Thus, we can come in to
something already set up well, provide help, and mostly back off and
be available for maintenance long-term.  The units that fund us
(schools, departments) have been happy with this, so we’ve kept it up.
On the other hand, we are pretty fast.  There have been projects for
which a summer worker was going to be hired, but which we ended up doing
(learning the framework + all the main tasks) in two weeks.  The way
we work together as a team also makes things quite fast.  Thus, a
project has to be quite deep in order to exceed a month of work.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2024/rse-funding/"/>
    <summary>The “Research Software Engineer” service provides technical
collaborators for researchers to complement their scientific
knowledge.  Read about Aalto RSE here.
The idea of this service was that it would be available to everyone,
but projects that made extensive use of it would fund it themselves. If you are a group leader reading this, Aalto RSE can help you release
research code, debug it, make it reusable, rescue old code from former
members and make it usable again, make it run on our cluster or CSC’s
clusters, manage data, prepare data for easy use, and so on.  If the
work is short (less than a month), it is free; if it takes more than a
month, the funding model below applies. When we started, we hoped for around 50% project funding.  The idea
was that a lot of the funding for the service would come from the
research projects themselves.  This hasn’t really worked out so well,
because a) we accomplish the vast majority of our projects quickly, in
less than a few weeks, and b) finance would understandably not like to
deal with small transactions for small amounts of time. What actually happened was that we have received only a small fraction of the
project funding we had hoped for.  On the other hand, this also
means we have supported a far wider variety of projects than we would
have otherwise.  It also means we are better accomplishing our other
goal: tactical support right where and when it’s needed most, with the
least amount of administrative overhead.  This actually better matches
our mission of helping the researchers who need us most. For any long projects (more than a month or so), we still follow
our original plan: we can receive funding from grants (or basic
funding) to do long-term projects.  This is usually 40-80% of an RSE’s
time, spread out over more than a month (and it can also be bursty:
lots of work at some times, waiting for the next task at other times).
We have done this for projects, and we know we can do it in the
future.But there’s another thing that has worked well: retainer-type
funding instead of project-based funding.  You have extra funding that
needs to be used?  You know your group needs support but you can’t
name a single specific project to use all the time?  Hire RSEs on
long-term retainers and we’re there for you as needed.  You will
always get priority for all the quick questions you have (in
help/garage or otherwise), and you get the highest
priority for your medium projects, we can attend your other group
meetings, and so on.  As your team wants, we’ll make high-impact
improvements here and there.  This could be (for example) anywhere
from 10-40% time over a long period.We have worked out how to do both one-off projects and retainers
with Finance.  As far as external funders are concerned, our staff
count as researchers, so we can use any funding you might have.If you think you have a project or want us on retainer, let us
know: rse/index or rse/for-researchers. This is a valid question.  Compared to many RSE groups, we seem to be
focusing on many small questions for a broad audience that knows a lot
about the problems they need to solve.  Thus, we can come in to
something already set up well, provide help, and mostly back off and
be available for maintenance long-term.  The units that fund us
(schools, departments) have been happy with this, so we’ve kept it up.
On the other hand, we are pretty fast.  There have been projects for
which a summer worker was going to be hired, but which we ended up doing
(learning the framework + all the main tasks) in two weeks.  The way
we work together as a team also makes things quite fast.  Thus, a
project has to be quite deep in order to exceed a month of work.</summary>
    <published>2024-02-21T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2024/kickstart-2023-wrap-up/</id>
    <title>Kickstart 2023 wrap-up and thoughts for the future</title>
    <updated>2024-02-15T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="kickstart-2023-wrap-up-and-thoughts-for-the-future"&gt;

&lt;p&gt;Our &lt;a class="reference external" href="https://scicomp.aalto.fi/training/scip/kickstart-2023/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;kickstart course&lt;/span&gt;&lt;/a&gt; came and
went with very few problems.  This post summarizes our general
thoughts on the course and its format.&lt;/p&gt;
&lt;p&gt;If you want to join the course next year (as an attendee, or as an
organization that will send its learners to us and maybe co-teach),
&lt;a class="reference external" href="https://fosstodon.org/&amp;#64;SciCompAalto"&gt;follow us on Mastodon&lt;/a&gt;.  This
is the third year we’ve done the livestream format, and it’s not
likely to stop anytime soon.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This was originally written in June 2023 but publication was
forgotten until 2024.&lt;/em&gt;&lt;/p&gt;
&lt;section id="history-of-the-course"&gt;
&lt;h2&gt;History of the course&lt;/h2&gt;
&lt;p&gt;The course has run since around 2015 or so.  Until mid 2020, it was always
in-person only.  Until (and including) 2022, it ran twice a year,
January and June, but now it runs only in June (increased availability
of videos + the material compensates).  It runs in June so that it
aligns with new summer research interns starting.  Until around 2020,
it was mostly about using the &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;HPC cluster at Aalto
University&lt;/span&gt;&lt;/a&gt;, but since then there has been more
emphasis on day 1 covering generic skills needed for scientific
computing and the big picture of things.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="general-feedback"&gt;
&lt;h2&gt;General feedback&lt;/h2&gt;
&lt;p&gt;Our general feedback remains quite positive.  Our streaming +
coteaching + collaborative notes format is still well received, and
there seems to be little reason to go back for courses of smaller
scale.  Instead of just lectures, written material (tutorials and info
on scicomp.aalto.fi) + livestream + videos is a good combination.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="not-enough-time"&gt;
&lt;h2&gt;Not enough time&lt;/h2&gt;
&lt;p&gt;There is never enough time - not much else to say.  Each year there is
a different trade-off between how much we cover and how brief we are.
(There are always people who say we should go more in-depth, and some
who say we go too much in-depth.  Such is life.)&lt;/p&gt;
&lt;/section&gt;
&lt;section id="reduce-repetition"&gt;
&lt;h2&gt;Reduce repetition&lt;/h2&gt;
&lt;p&gt;Repetition is good, but not when it’s a sign that we can’t stop
talking and keep saying the same thing over and over.  The best
lessons seemed to be the ones that were taught most quickly, since they
have a high density of new information.  We should strive to make more
lessons faster, and leave details to the reading.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="integrated-support-and-teaching"&gt;
&lt;h2&gt;Integrated support and teaching&lt;/h2&gt;
&lt;p&gt;Because the teachers also do support, for anything difficult, we can
easily tell learners: “Do what you can, come by our
&lt;a class="reference external" href="https://scicomp.aalto.fi/help/garage/#garage" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-ref"&gt;SciComp garage&lt;/span&gt;&lt;/a&gt; to ask for help with anything
else.”  This overall reduces the demands from teaching: a person
doesn’t have to know everything, but know enough to get started and to
know when they may need more help for more advanced tools.  This
really is good for both of us.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="linux-shell-and-other-prerequisites"&gt;
&lt;h2&gt;Linux shell and other prerequisites&lt;/h2&gt;
&lt;p&gt;As usual, we expected our learners to read our &lt;a class="reference external" href="https://scicomp.aalto.fi/scicomp/shell/"&gt;shell crash course&lt;/a&gt; in advance.  We also had
a new tutorial on &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/tut/cluster-shell/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;using the cluster from the shell&lt;/span&gt;&lt;/a&gt;.  This helped some, but it was still a
problem.&lt;/p&gt;
&lt;p&gt;Reflection: this will always be a problem in any course that has a
wide enough audience.  We should accept and provide positive support
for those not ready, and not try to exclude them.  It’s OK to see a
course and then strive to get the prerequisites later.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="should-the-course-be-divided-into-two"&gt;
&lt;h2&gt;Should the course be divided into two?&lt;/h2&gt;
&lt;p&gt;Internally, we had this thought of dividing the course in two: a basic
part at the start of the summer, and an advanced part at the end of
the summer -
since brand new researchers may have trouble understanding
everything.  On the other hand, the fact we have videos means that
people can come back and review the material when they are ready.  So
in some sense, learners can divide the course however they would like
by stopping when they think it’s no longer necessary and coming back.
This could be mentioned more explicitly in our introductions.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="attendance"&gt;
&lt;h2&gt;Attendance&lt;/h2&gt;
&lt;p&gt;Attendance goes down day-by-day.  This is definitely OK - it doesn’t
hurt anyone.  It’s expected that day 1 was suitable for the most
people (even those not doing HPC work), and then the course topics got
continually more specific as we went further and further.&lt;/p&gt;
&lt;p&gt;As mentioned above, this is even expected and encouraged - better
to have someone attend only day 1 than not at all.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="exercise-level"&gt;
&lt;h2&gt;Exercise level&lt;/h2&gt;
&lt;p&gt;Our exercises are quite basic overall, but we got few complaints about
this.  Basic exercises are better than something too advanced or
realistic that requires many things to come together.&lt;/p&gt;
&lt;p&gt;This year, we tried to have a complete solution for every exercise
(script and/or commands), even if it’s directly said above in the
lesson.  This seemed to be good, since for people very short of time,
they still have some chance to copy and paste and do the exercises.
For those passively following, they can at least see what would have
been done.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="other-feedback-from-the-notes"&gt;
&lt;h2&gt;Other feedback from the notes&lt;/h2&gt;
&lt;p&gt;Day 3 / end of course positive feedback (&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;o&lt;/span&gt;&lt;/code&gt; is the way a
person votes for/agrees with that option):&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;it’s great that the material is so easily accessible also after the
course to go through things in my own pace again oo&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Really good format with the streaming and the shared document for
questions. ooooo&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The cat kept me focused in the lecture&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Live interaction with the instructes were very helpful and exercises were nice&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I really appreaciate the instructors took the time to explain the
jargons, instead of just letting them fly around. o&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The fact that the instructors were really nice contributed to the
good course experience. Thanks for that! o&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;(day 1) After studying remotely for 1,5 year and having lots of
online classes, I highly appreciate the amazing audio quality
here. Many thanks for that!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;(day 1) The framework is better than any other workshop I’ve ever
attended - in terms of interaction and audio quality. HackMD is
great.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;(day 1) The (twitch) vertical screen thing is genius and should be
used in way more (online) lectures o&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Most common negative feedback: not enough time! In fact, that’s
almost the only thing to improve. Except we can’t, so I think we win
pretty well. And videos/material allow follow-up.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="see-also"&gt;
&lt;h2&gt;See also&lt;/h2&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://scicomp.aalto.fi/training/scip/kickstart-2023/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;Summer kickstart&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://www.youtube.com/watch?v=gi_zHFPgpfw&amp;amp;list=PLZLVmS9rf3nN1Rj-TAqFEzFM22Y1kJmvn"&gt;How we did summer kickstart 2021&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2024/kickstart-2023-wrap-up/"/>
    <summary>Our kickstart course came and
went with very few problems.  This post summarizes our general
thoughts on the course and its format. If you want to join the course next year (as an attendee, or as an
organization that will send its learners to us and maybe co-teach),
follow us on Mastodon.  This
is the third year we’ve done the livestream format, and it’s not
likely to stop anytime soon.This was originally written in June 2023 but publication was
forgotten until 2024.The course has run since around 2015 or so.  Until mid 2020, it was always
in-person only.  Until (and including) 2022, it ran twice a year,
January and June, but now it runs only in June (increased availability
of videos + the material compensates).  It runs in June so that it
aligns with new summer research interns starting.  Until around 2020,
it was mostly about using the HPC cluster at Aalto
University, but since then there has been more
emphasis on day 1 covering generic skills needed for scientific
computing and the big picture of things.Our general feedback remains quite positive.  Our streaming +
coteaching + collaborative notes format is still well received, and
there seems to be little reason to go back for courses of smaller
scale.  Instead of just lectures, written material (tutorials in info
on scicomp.aalto.fi) + livestream + videos is a good combination.There is never enough time - not much else to say.  Each year there is
a different trade-off between how much we cover and how brief we are.
(There are always people who say we should go more in-depth, and some
who say we go too much in-depth.  Such is life.)Repetition is good, but not when it’s a sign that we can’t stop
talking and keep saying the same thing over and over.  The best
lessons seemed to be the ones that were taught most quickly, since they
have a high density of new information.  We should strive to make more
lessons faster, and leave details to the reading.Because the teachers also do support, for anything difficult, we can
easily tell learners: “Do what you can, come by our
SciComp garage to ask for help with anything
else.”  This overall reduces the demands from teaching: a person
doesn’t have to know everything, but know enough to get started and to
know when they may need more help for more advanced tools.  This
really is good for both of us.As usual, we expected our learners to read our shell crash course in advance.  We also had
a new tutorial on using the cluster from the shell.  This helped some, but it was still a
problem.Reflection: this will always be a problem in any course that has a
wide enough audience.  We should accept and provide positive support
for those not ready, and not try to exclude them.  It’s OK to see a
course and then strive to get the prerequisites later.Internally, we had this thought of dividing the course in two: a basic
part at the start of the summer, and an advanced part at the end of
the summer -
since brand new researchers may have trouble understanding
everything.  On the other hand, the fact we have videos means that
people can come back and review the material when they are ready.  So
in some sense, learners can divide the course however they would like
by stopping when they think it’s no longer necessary and coming back.
This could be mentioned more explicitly in our introductions.Attendance goes down day-by-day.  This is definitely OK - it doesn’t
hurt anyone.  It’s expected that day 1 was suitable for the most
people (even those not doing HPC work), and then the course topics got
continually more specific as we went further and further. As mentioned above, this is even expected and encouraged - better
to have someone attend only day 1 than not at all. Our exercises are quite basic overall, but we got few complaints about
this.  Basic exercises are better than something too advanced or
realistic, that requires many things to come together.This year, we tried to have a complete solution for every exercise
(script and/or commands), even if it’s directly said above in the
lesson.  This seemed to be good, since for people very short of time,
they still have some chance to copy and paste and do the exercises.
For those passively following, they can at least see what would have
been done. Day 3 / end of course positive feedback (o is the way a
person votes for/agrees with that option): it’s great that the material is so easily accessible also after the
course to go through things in my own pace again ooReally good format with the streaming and the shared document for
questions. oooooThe cat kept me focused in the lectureLive interaction with the instructes were very helpful and exercises were niceI really appreaciate the instructors took the time to explain the
jargons, instead of just letting them fly around. oThe fact that the instructors were really nice contributed to the
good course experience. Thanks for that! o(day 1) After studying remotely for 1,5 year and having lots of
online classes, I highly appreciate the amazing audio quality
here. Many thanks for that!(day 1) The framework is better than any other workshop I’ve ever
attended - in terms of interaction and audio quality. HackMD is
great.(day 1) The (twitch) vertical screen thing is genius and should be
used in way more (online) lectures oMost common negative feedback: not enough time! In fact, that’s
almost the only thing to improve. Except we can’t, so I think we win
pretty well. And videos/material allows follow-up.Summer kickstartHow we did summer kickstart 2021</summary>
    <published>2024-02-15T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2023/dev-day-2023-aug/</id>
    <title>ASC development day, 2023 August</title>
    <updated>2023-10-30T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="asc-development-day-2023-august"&gt;

&lt;p&gt;We had another development day (previous:
&lt;a class="reference internal" href="../2023/march-development-day/"&gt;&lt;span class="doc"&gt;ASC development day, 2023 March&lt;/span&gt;&lt;/a&gt;).  It went mostly like the last one, and
we have less important news for the world, but below is the summary
anyway.&lt;/p&gt;
&lt;section id="stats"&gt;
&lt;h2&gt;Stats&lt;/h2&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;We have about 1550 people with accounts, with 202 new account
requests in the last six months.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Most routine issues tend to be about software installation, which is
good (this is actually the hard part, so it’s good that people ask us).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We are still on track for about 500 garage visits per year.  We
don’t try too hard to keep track of them all, we might get about 75%
of them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The number of &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/tut/interactive/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;interactive&lt;/span&gt;&lt;/a&gt;
and &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/apps/jupyter/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;Jupyter&lt;/span&gt;&lt;/a&gt; users are increasing, while
&lt;a class="reference external" href="https://scicomp.aalto.fi/triton/usage/ood/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;Open OnDemand&lt;/span&gt;&lt;/a&gt; is decreasing.  This
is the wrong direction from what we’d like.  We will open
OOD to connections from all of Finland to make this easier.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="triton-v3"&gt;
&lt;h2&gt;Triton v3&lt;/h2&gt;
&lt;p&gt;Triton v3 is still on the way.  This isn’t a new cluster, but a new
operating system which individual nodes will be migrated to slowly
(while maintaining the same accounts and data).  Most of this happens
in the background, but the change of base operating system images will
require most code to be recompiled, which will require attention
from many users.  The transition can be made slowly: both old and new
OSs will run for the time being.  There won’t be a change in total
amount of computing power.&lt;/p&gt;
&lt;p&gt;An upcoming blog post will discuss this more, and the effects on
users.  &lt;em&gt;Now is the time to start preparing.&lt;/em&gt;  We still expect the
transition to happen sometime in the autumn.&lt;/p&gt;
&lt;p&gt;We are thinking of merging home and scratch directories, to make a
common quota for both.  This would improve usability by reducing the
frequency of home quota affecting usage.  We’d welcome any other
usability suggestions.&lt;/p&gt;
&lt;p&gt;Practically, we are using the chance to automate things even more,
which should make it easier to manage in the future.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="teaching"&gt;
&lt;h2&gt;Teaching&lt;/h2&gt;
&lt;p&gt;Teaching has gone well.  For this academic year, we’d like to add back
in a few smaller, special-purpose courses (not just to teach them, but
also to get good quality video recordings for the future).&lt;/p&gt;
&lt;p&gt;Goals:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Developing and delivering the “&lt;a class="reference external" href="https://scicomp.aalto.fi/training/scip/ttt4hpc-2024/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;workflows&lt;/span&gt;&lt;/a&gt;” course with CodeRefinery&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Short courses to record (e.g. rerun of debug series, once a week,
record and publish).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Update the &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/usage/debugging/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;Debugging&lt;/span&gt;&lt;/a&gt; page to link the
different debugging course repositories.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="lumi"&gt;
&lt;h2&gt;LUMI&lt;/h2&gt;
&lt;p&gt;LUMI is the new EU cluster with plentiful GPU resources.  A user can
essentially get as many GPU resources as they need with no waiting,
but since the GPUs are AMD, there is some initial barrier.  Our
general feeling remains: “we won’t recommend our users directly go and
use LUMI, but we recommend they talk with us first and we help them
use it”.&lt;/p&gt;
&lt;p&gt;Next steps:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Continue encouraging users to contact us.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;RSEs will ask the top GPU user each week if they would like support
with taking LUMI into use.  We’ll go and do all the setup for them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Slide on infoscreens around the buildings?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2023/dev-day-2023-aug/"/>
    <summary>We had another development day (previous:
/2023/march-development-day).  It went mostly like the last one, and
we have less important news for the world, but below is the summary
anyway.We have about 1550 people with accounts, with 202 new account
requests in the last six months.Most routine issues tend to be about software installation, which is
good (this is actually the hard part, so it’s good that people ask us). We are still on track for about 500 garage visits per year.  We
don’t try too hard to keep track of them all, we might get about 75%
of them.The number of interactive
and Jupyter users is increasing, while
Open OnDemand is decreasing.  This
is the wrong direction from what we’d like.  We will open
OOD to connections from all of Finland to make this easier.Triton v3 is still on the way.  This isn’t a new cluster, but a new
operating system which individual nodes will be migrated to slowly
(while maintaining the same accounts and data).  Most of this happens
in the background, but the change of base operating system images will
require most code to be recompiled, which will require attention
from many users.  The transition can be made slowly: both old and new
OSs will run for the time being.  There won’t be a change in total
amount of computing power.An upcoming blog post will discuss this more, and the effects on
users.  Now is the time to start preparing.  We still expect the
transition to happen sometime in the autumn. We are thinking of merging home and scratch directories, to make a
common quota for both.  This would improve usability by reducing the
frequency of home quota affecting usage.  We’d welcome any other
usability suggestions.Practically, we are using the chance to automate things even more,
which should make it easier to manage in the future.Teaching has gone well.  For this academic year, we’d like to add back
in a few smaller, special-purpose courses (not just to teach them, but
also to get good quality video recordings for the future).Goals:Developing and delivering the “workflows” course with CodeRefineryShort courses to record (e.g. rerun of debug series, once a week,
record and publish).Update triton/usage/debugging linking the
different debugging course repositories. LUMI is the new EU cluster with plentiful GPU resources.  A user can
essentially get as many GPU resources as they need with no waiting,
but since the GPUs are AMD, there is some initial barrier.  Our
general feeling remains: “we won’t recommend our users directly go and
use LUMI, but we recommend they talk with us first and we help them
use it”.Next steps:Continue encouraging users to contact us.RSEs will ask the top GPU user each week if they would like support
with taking LUMI into use.  We’ll go and do all the setup for them.Slide on infoscreens around the buildings?</summary>
    <published>2023-10-30T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2023/webp-security-vulnerability/</id>
    <title>libwebp security vulnerability and computational scientists</title>
    <updated>2023-09-28T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="libwebp-security-vulnerability-and-computational-scientists"&gt;

&lt;p&gt;Recently, a &lt;a class="reference external" href="https://blog.isosceles.com/the-webp-0day/"&gt;major security vulnerability (CVE-2023-4863, also briefly tracked as CVE-2023-5129)&lt;/a&gt; has been
found in &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;libwebp&lt;/span&gt;&lt;/code&gt;, an image decoding library for the &lt;cite&gt;.webp&lt;/cite&gt; format.
This is major, since this library is embedded in many apps and web
browsers and allows remote code execution just by opening a file.  For
computational scientists, there is still some impact - and it’s harder
to compensate for.  In short, just by processing an image in the .webp
format, someone can take over your computer or session.&lt;/p&gt;
&lt;p&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;libwebp&lt;/span&gt;&lt;/code&gt; is the current issue, but the problem is general:
&lt;strong&gt;computational scientists often create software environments and use
them for a long
time.  These environments aren’t usually browsing the web (the most
likely attack vector here), but they do involve lots of code installed
from different projects.  How does one manage security in this case?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This post may be updated&lt;/em&gt;&lt;/p&gt;
&lt;section id="how-it-affects-scientists"&gt;
&lt;h2&gt;How it affects scientists&lt;/h2&gt;
&lt;p&gt;If you use web browsers or apps on your own desktops, laptops, phones,
etc. - make sure to update them!&lt;/p&gt;
&lt;p&gt;If you don’t use images in your research, there probably isn’t much
impact.&lt;/p&gt;
&lt;p&gt;If you &lt;em&gt;do&lt;/em&gt;, this is what could happen:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;You make a Python / Anaconda environment which uses &lt;cite&gt;libwebp&lt;/cite&gt;
somehow - directly installed through Conda, or some other
application.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You download a dataset containing images.  You process them as part
of your research with the old environment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The malicious image runs an exploit.  It has access to your whole
user account on that computer: extract any data, add SSH keys for
remote access, corrupt/delete data (which might not be backed up
from the cluster…).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Many things have to happen here, but it’s very possible for it to
happen.  You could &lt;strong&gt;lose access to non-backed up data or code&lt;/strong&gt; or
&lt;strong&gt;other confidential or sensitive data could be compromised&lt;/strong&gt;, since
code from one project from your user account has access to all
projects from your account.&lt;/p&gt;
&lt;p&gt;One would normally fix things by updating software.  But when you are
dealing with a research environment that can’t easily be updated, what
should you do?  This is the real question here.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-to-do"&gt;
&lt;h2&gt;What to do&lt;/h2&gt;
&lt;p&gt;It’s a multi-layered problem, and the answer will depend on your
work.  &lt;strong&gt;libwebp is what we are thinking about now, but the problem is
general: there are other security problems that occasionally come up
that can affect more scientific code.  How do you prepare for next time?&lt;/strong&gt;&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Update your environments (conda, virtualenv, etc).  You could try to
see if &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;libwebp&lt;/span&gt;&lt;/code&gt; is inside of them (&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;conda&lt;/span&gt; &lt;span class="pre"&gt;list&lt;/span&gt; &lt;span class="pre"&gt;|&lt;/span&gt; &lt;span class="pre"&gt;grep&lt;/span&gt; &lt;span class="pre"&gt;webp&lt;/span&gt;&lt;/code&gt;),
but especially for Pip packages it might not be apparent.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Make your environments reproducible: If you define your dependencies
in &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;requirements.txt&lt;/span&gt;&lt;/code&gt; (Python), &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;environment.yml&lt;/span&gt;&lt;/code&gt; (conda), or
whatever is suitable for your language, you can easily re-generate
environments to bring everything up to date.  (delete old one,
re-create).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Be careful when pinning versions of dependencies (like &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;numpy==1.20.0&lt;/span&gt;&lt;/code&gt;): a pin on
one package can also pull in older versions of other dependencies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Containerize your workflows.  If code runs inside of a container, it
keeps it isolated from the rest of the operating system and user
account.  (Containers aren’t usually designed for strict
security, but it’s better than nothing.)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you use pre-built modules on the cluster, try not to use old
versions.  We’ll update some recent modules, but we can’t update all
of the old ones.  At least &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;webp&lt;/span&gt;&lt;/code&gt; is in the default &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;anaconda&lt;/span&gt;&lt;/code&gt;
modules.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you write or maintain software in general, keep it up to date as
much as reasonable!  Don’t make others get into a place where they
are having to use old versions of libraries to make it work.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In general, think about your dependencies.  Be at least a little bit
suspicious before you install random other software, that may
possibly pull in lots of other dependencies.  Of course, as a
researcher, you may not have much choice.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
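&lt;p&gt;As a rough sketch of the first two points (not a tested recipe): the
commands below look for copies of libwebp inside an activated conda
environment, including ones bundled within Pip packages that do not appear
as a separate libwebp entry in &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;conda&lt;/span&gt; &lt;span class="pre"&gt;list&lt;/span&gt;&lt;/code&gt;, and then
re-create the environment from its definition file.  The name &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;myenv&lt;/span&gt;&lt;/code&gt;
is a placeholder, we assume &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;$CONDA_PREFIX&lt;/span&gt;&lt;/code&gt; is set by the activated
environment (for a virtualenv, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;$VIRTUAL_ENV&lt;/span&gt;&lt;/code&gt; would play the same role),
and re-creating only brings in newer versions if &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;environment.yml&lt;/span&gt;&lt;/code&gt;
does not pin everything exactly:&lt;/p&gt;
&lt;div class="highlight-console notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;$ &lt;/span&gt;find &amp;quot;$CONDA_PREFIX&amp;quot; -name &amp;#39;libwebp*&amp;#39;
&lt;span class="gp"&gt;$ &lt;/span&gt;conda deactivate
&lt;span class="gp"&gt;$ &lt;/span&gt;conda env remove -n myenv
&lt;span class="gp"&gt;$ &lt;/span&gt;conda env create -n myenv -f environment.yml
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;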
&lt;section id="updating-your-environments"&gt;
&lt;h3&gt;Updating your environments&lt;/h3&gt;
&lt;p&gt;These commands seem to be able to update an environment to a newer
libwebp.  It &lt;em&gt;seems&lt;/em&gt; to work on newer environments, but we don’t know
for sure.  Instead of &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;mamba&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;conda&lt;/span&gt;&lt;/code&gt; works in theory but is so
slow that it may not be practical:&lt;/p&gt;
&lt;div class="highlight-console notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;$ &lt;/span&gt;mamba&lt;span class="w"&gt; &lt;/span&gt;env&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;environment.,yml
&lt;span class="gp"&gt;$ &lt;/span&gt;perl&lt;span class="w"&gt; &lt;/span&gt;-i&lt;span class="w"&gt; &lt;/span&gt;-pe&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;s/(libwebp(-base)?)=.*$/\1=1.3.2/g&amp;#39;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;environment.yml
&lt;span class="gp"&gt;$ &lt;/span&gt;mamba&lt;span class="w"&gt; &lt;/span&gt;env&lt;span class="w"&gt; &lt;/span&gt;update&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;environment.yml
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/section&gt;
&lt;/section&gt;
&lt;section id="summary"&gt;
&lt;h2&gt;Summary&lt;/h2&gt;
&lt;p&gt;There is a major security vulnerability in &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;libwebp&lt;/span&gt;&lt;/code&gt;.  While the
impact on computational scientists may not be &lt;em&gt;that&lt;/em&gt; much, a bigger
issue is the difficulty of keeping all of the environments up to date
so that next time this happens, it’s easier to respond.&lt;/p&gt;
&lt;p&gt;We hope to have more security recommendations for computational
scientists in the future.  If anyone is interested in collaborating on
this, let us know.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="aside-what-s-affected"&gt;
&lt;h2&gt;Aside: What’s affected?&lt;/h2&gt;
&lt;p&gt;Common apps which embed Chrome or libwebp: Chrome, Firefox, VSCode,
Zulip, Slack, Discord… things that use Electron to embed a web
browser are affected, and that’s &lt;em&gt;many&lt;/em&gt; things.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2023/webp-security-vulnerability/"/>
    <summary>Recently, a major security vulnerability (CVE-2023-5129) has been
found in libwebp, an image decoding library for the .webp format.
This is major, since this library is embedded in many apps and web
browsers and allows remote code execution just by opening a file.  For
computational scientists, there is still some impact - and it’s harder
to compensate for.  In short, just by processing an image in the .webp
format, someone can take over your computer or session.libwebp is the current issue, but the problem is general:
computational scientists often create software environments and use
them for a long
time.  These environments aren’t usually browsing the web (the most
likely attack vector here), but they do involve lots of code installed
from different projects.  How does one manage security in this case?This post may be updatedIf you use web browsers or apps on your own desktops, laptops, phones,
etc. - make sure to update them!If you don’t use images in your research, there probably isn’t much
impact.If you do, this is what could happen:You make a Python / Anaconda environment which uses libwebp
somehow - directly installed through Conda, or some other
application.You download a dataset containing images.  You process them as part
of your research with the old environment.The malicious image runs an exploit.  It has access to your whole
user account on that computer: extract any data, add SSH keys for
remote access, corrupt/delete data (which might not be backed up
from the cluster…).Many things have to happen here, but it’s very possible for it to
happen.  You could lose access to non-backed up data or code or
other confidential or sensitive data could be compromised, since
code from one project from your user account has access to all
projects from your account.One would normally fix things by updating software.  But when you are
dealing with a research environment that can’t easily be updated, what
should you do?  This is the real question here.It’s a multi-layered problem, and the answer will depend on your
work.  libwebp is what we are thinking about now, but the problem is
general: there are other security problems that occasionally come up
that can affect more scientific code.  How do you prepare for next time?Update your environments (conda, virtualenv, etc).  You could try to
see if libwebp is inside of them (conda list | grep webp),
but especially for Pip packages it might not be apparent.Make your environments reproducible: If you define your dependencies
in requirements.txt (Python), environment.yml (conda), or
whatever is suitable for your language, you can easily re-generate
environments to bring everything up to date.  (delete old one,
re-create).Be careful when pinning versions of dependencies (like numpy==1.20.0): a pin on
one package can also pull in older versions of other dependencies.Containerize your workflows.  If code runs inside of a container, it
keeps it isolated from the rest of the operating system and user
account.  (Containers aren’t usually designed for strict
security, but it’s better than nothing.)If you use pre-built modules on the cluster, try not to use old
versions.  We’ll update some recent modules, but we can’t update all
of the old ones.  At least webp is in the default anaconda
modules.If you write or maintain software in general, keep it up to date as
much as reasonable!  Don’t make others get into a place where they
are having to use old versions of libraries to make it work.In general, think about your dependencies.  Be at least a little bit
suspicious before you install random other software, that may
possibly pull in lots of other dependencies.  Of course, as a
researcher, you may not have much choice.These commands seem to be able to update an environment to a newer
libwebp.  It seems to work on newer environments, but we don’t know
for sure.  Instead of mamba, conda works in theory but is so
slow that it may not be practical:There is a major security vulnerability in libwebp.  While the
impact on computational scientists may not be that much, a bigger
issue is the difficulty of keeping all of the environments up to date
so that next time this happens, it’s easier to respond.We hope to have more security recommendations for computational
scientists in the future.  If anyone is interested in collaborating on
this, let us know.Common apps which embed Chrome or libwebp: Chrome, Firefox, VSCode,
Zulip, Slack, Discord… things that use Electron to embed a web
browser are affected, and that’s many things.</summary>
    <published>2023-09-28T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2023/ssh-keys-with-passwords/</id>
    <title>Aalto public servers requiring passwords with SSH keys</title>
    <updated>2023-09-27T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="aalto-public-servers-requiring-passwords-with-ssh-keys"&gt;

&lt;p&gt;From 2023-09-25, publicly accessible Aalto server login is changing
and will now require a password in addition to SSH keys.  This will
have a significant usability impact on some users.  This post is made
as a landing page for users who need immediate, practical help and for
whom the &lt;a class="reference external" href="https://www.aalto.fi/en/news/ssh-connections-to-public-linux-servers-from-outside-the-aalto-network-will-require-both-a-password"&gt;aalto.fi page&lt;/a&gt;
isn’t findable or detailed enough.  The official contact is the &lt;a class="reference external" href="https://www.aalto.fi/en/services/it-service-desk-contact-information-and-service-hours"&gt;IT
Services service desk&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The reference page &lt;a class="reference external" href="https://scicomp.aalto.fi/scicomp/ssh/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;SSH&lt;/span&gt;&lt;/a&gt; has been updated to
include detailed reference information for every common operating
system and SSH client.  Secure Shell is one of the standard methods of
connecting to remote servers and it is important that users of all
skill levels are able to use it securely.&lt;/p&gt;
&lt;p&gt;This change is &lt;em&gt;not&lt;/em&gt; from Science-IT, but since it will affect many of
our users and is not being publicized or supported very much, we are
preemptively doing some major user support.&lt;/p&gt;
&lt;section id="what-s-happening"&gt;
&lt;h2&gt;What’s happening&lt;/h2&gt;
&lt;p&gt;What is &lt;strong&gt;not&lt;/strong&gt; happening is: requiring locally encrypted SSH keys (although this is highly recommended).&lt;/p&gt;
&lt;p&gt;What &lt;strong&gt;is&lt;/strong&gt; happening: When you connect to an SSH server from outside
Aalto networks, you will need to have an SSH key set up &lt;strong&gt;and&lt;/strong&gt; send
your Aalto password to the remote server interactively.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-to-do"&gt;
&lt;h2&gt;What to do&lt;/h2&gt;
&lt;p&gt;If you already have an SSH key set up, you’ll start to be asked to
enter a password every time you connect.&lt;/p&gt;
&lt;p&gt;You can always connect to the Aalto VPN in advance to prevent this,
but there may be cases where this isn’t a practical solution.&lt;/p&gt;
&lt;p&gt;If you do not have an SSH key set up, you should:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Follow &lt;a class="reference external" href="https://scicomp.aalto.fi/scicomp/ssh/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;SSH&lt;/span&gt;&lt;/a&gt; to generate an SSH key - we have
&lt;em&gt;heavily&lt;/em&gt; revised this page to cover almost every common SSH
arrangement.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Place your SSH key on any common Aalto server (&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;kosh&lt;/span&gt;&lt;/code&gt;, etc. -
&lt;em&gt;not&lt;/em&gt; Triton since that doesn’t share home directories with the
public servers)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;You could connect by VPN, and then use your normal password to connect
and add the key.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You could use &lt;a class="reference external" href="https://vdi.aalto.fi"&gt;https://vdi.aalto.fi&lt;/a&gt; with a Linux computer to place
the key.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You could place the key while on an Aalto network (as usual, this
means &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;eduroam&lt;/span&gt;&lt;/code&gt; or &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;aalto&lt;/span&gt;&lt;/code&gt; &lt;em&gt;only&lt;/em&gt; from an Aalto computer).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You could use another computer that’s already set up with an SSH
key to place the key.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The key will then be available on all common Aalto shell servers
(and other workstations), since they share the home directory.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Re-read &lt;a class="reference external" href="https://scicomp.aalto.fi/scicomp/ssh/" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;SSH&lt;/span&gt;&lt;/a&gt;, in particular the
&lt;a class="reference external" href="https://scicomp.aalto.fi/scicomp/ssh/#ssh-agent" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;SSH key agent&lt;/span&gt;&lt;/a&gt;, &lt;a class="reference external" href="https://scicomp.aalto.fi/scicomp/ssh/#proxyjump" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;ProxyJump&lt;/span&gt;&lt;/a&gt; and
&lt;a class="reference external" href="https://scicomp.aalto.fi/scicomp/ssh/#ssh-multiplex" title="(in Aalto Scientific Computing)"&gt;&lt;span&gt;Multiplexing&lt;/span&gt;&lt;/a&gt; sections, to see how to configure your
SSH to minimize the number of times you need to enter passwords.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
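&lt;p&gt;To give an idea of what the multiplexing and ProxyJump settings can look
like, here is a minimal sketch of a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;~/.ssh/config&lt;/span&gt;&lt;/code&gt; file.  It is an
illustration under assumptions, not the official configuration:
&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;USERNAME&lt;/span&gt;&lt;/code&gt; is a placeholder, the host names and the one-hour
persistence time are our assumptions, and the reference page remains the
authoritative source:&lt;/p&gt;
&lt;div class="highlight-none notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;# ~/.ssh/config (sketch: USERNAME and host names are placeholders)
Host kosh
    HostName kosh.aalto.fi
    User USERNAME
    # Multiplexing: later sessions reuse the first authenticated connection,
    # so the password is only asked when the first connection is opened.
    ControlMaster auto
    ControlPath ~/.ssh/control-%C
    ControlPersist 1h

Host triton
    HostName triton.aalto.fi
    User USERNAME
    # ProxyJump: reach this host through the kosh entry above.
    ProxyJump kosh
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;With something like this in place, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;ssh&lt;/span&gt; &lt;span class="pre"&gt;kosh&lt;/span&gt;&lt;/code&gt; would ask for the
password once, and further connections made while the shared connection is
alive should not ask again.&lt;/p&gt;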
&lt;/section&gt;
&lt;section id="motivations"&gt;
&lt;h2&gt;Motivations&lt;/h2&gt;
&lt;p&gt;This was needed for security as evidenced by recent history.
Password-only login is simply not feasible anymore (and hasn’t been for
some time).  Removing passwords as an option is good security practice that
most organizations should adopt these days.&lt;/p&gt;
&lt;p&gt;But why an ssh key &lt;em&gt;and&lt;/em&gt; remote password instead of a properly
encrypted SSH key?  An SSH key requires something you have (the key)
and something you know (the password), doesn’t it?  And it doesn’t
require sending a plaintext password to the remote server.  This was
decided by whoever is setting this up, probably partly due to the
fact that it is not possible to enforce passwords on SSH keys via
the server config.&lt;/p&gt;
&lt;p&gt;In general (outside of Aalto), you should use SSH keys everywhere and
be wary of ever sending plaintext passwords to remote servers
(even in conjunction with an SSH key).  Security is important, and by
using SSH keys with local encryption of the key you are doing your part.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="this-is-affecting-important-workflows"&gt;
&lt;h2&gt;This is affecting important workflows&lt;/h2&gt;
&lt;p&gt;We apologize for the difficulty in getting work done and want to help
you as much as possible (though Science-IT was &lt;em&gt;not&lt;/em&gt; the one that
designed this or communicated it).&lt;/p&gt;
&lt;p&gt;There are, unfortunately, some trivial workarounds that involve
putting your password in plain text on your computer to script things.
However, please note that writing passwords down (outside of password
managers) is bad security practice and against the &lt;a class="reference external" href="https://www.aalto.fi/en/services/password-guidelines"&gt;Aalto password guidelines&lt;/a&gt;. It is better to
&lt;a class="reference external" href="https://scicomp.aalto.fi/help/garage/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;contact us&lt;/span&gt;&lt;/a&gt; to
help design a better and more secure workflow, or contact &lt;a class="reference external" href="https://www.aalto.fi/en/services/it-services"&gt;IT Services&lt;/a&gt; and ask them to
consider other use cases.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2023/ssh-keys-with-passwords/"/>
    <summary>From 2023-09-25, publicly accessible Aalto server login is changing
and will now require a password in addition to SSH keys.  This will
have a significant usability impact on some users.  This post is made
as a landing page for users who need immediate, practical help and for
whom the aalto.fi page
isn’t findable or detailed enough.  The official contact is the IT
Services service desk.The reference page scicomp/ssh has been updated to
include detailed reference information for every common operating
system and SSH client.  Secure Shell is one of the standard methods of
connecting to remote servers and it is important that users of all
skill levels are able to use it securely.This change is not from Science-IT, but since it will affect many of
our users and is not being publicized or supported very much, we are
preemptively doing some major user support.What is not happening is: requiring locally encrypted SSH keys (although this is highly recommended).What is happening: When you connect to an SSH server from outside
Aalto networks, you will need to have an SSH key set up and send
your Aalto password to the remote server interactively.If you already have an SSH key set up, you’ll start to be asked to
enter a password every time you connect.You can always connect to the Aalto VPN in advance to prevent this,
but there may be cases where this isn’t a practical solution.If you do not have an SSH key set up, you should:Follow scicomp/ssh to generate an SSH key - we have
heavily revised this page to cover almost every common SSH
arrangement.Place your SSH key on any common Aalto server (kosh, etc. -
not Triton since that doesn’t share home directories with the
public servers)You could connect by VPN, and then use your normal password to connect
and add the key.You could use https://vdi.aalto.fi with a Linux computer to place
the key.You could place the key while on an Aalto network (as usual, this
means eduroam or aalto only from an Aalto computer).You could use another computer that’s already set up with an SSH
key to place the key.The key will then be available on all common Aalto shell servers
(and other workstations), since they share the home directory.Re-read scicomp/ssh, in particular the
ssh-agent, proxyjump and
ssh-multiplex sections, to see how to configure your
SSH to minimize the number of times you need to enter passwords.This was needed for security as evidenced by recent history.
Password-only login is simply not feasible anymore (and hasn’t been for
some time).  Removing passwords as an option is good security practice that
most organizations should adopt these days.But why an ssh key and remote password instead of a properly
encrypted SSH key?  An SSH key requires something you have (the key)
and something you know (the password), doesn’t it?  And it doesn’t
require sending a plaintext password to the remote server.  This was
decided by whoever is setting this up, probably partly due to the
fact that it is not possible to enforce passwords on SSH keys via
the server config.In general (outside of Aalto), you should use SSH keys everywhere and
be wary of ever sending plaintext passwords to remote servers
(even in conjunction with an SSH key).  Security is important, and by
using SSH keys with local encryption of the key you are doing your part.We apologize for the difficulty in getting work done and want to help
you as much as possible (though Science-IT was not the one that
designed this or communicated it).There are, unfortunately, some trivial workarounds that involve
putting your password in plain text on your computer to script things.
However, please note that writing passwords down (outside of password
managers) is bad security practice and against the Aalto password guidelines. It is better to
contact us to
help design a better and more secure workflow, or contact IT Services and ask them to
consider other use cases.</summary>
    <published>2023-09-27T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2023/preparing-for-new-triton/</id>
    <title>Preparing for new Triton</title>
    <updated>2023-09-12T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="preparing-for-new-triton"&gt;

&lt;p&gt;Sometime in autumn of 2023 (e.g. October/November), we will do a major
update of Triton: updating the basic operating system, and thus almost
everything else.  There are big benefits to this, such as newer basic
operating system software, but such a basic update also affects almost
every user.  &lt;strong&gt;For a short time, this will make a lot of work for almost
every user.  This post gives advance warning and a chance to give feedback
on how to make the update as usable as possible.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This post is just advance warning and things to prepare already.  All
actual instructions will come later.&lt;/p&gt;
&lt;section id="what-will-happen"&gt;
&lt;h2&gt;What will happen&lt;/h2&gt;
&lt;p&gt;We will update the basic operating system from CentOS 7 to something
else (Red Hat 9).  We’ve ordered all new management hardware
to make the backend more reliable and manageable.  Along with this
comes an update of the software build system, which should allow
us to deploy software to our users even better.  We’ll also update our
configuration management system for more reproducibility.&lt;/p&gt;
&lt;p&gt;We also hope to think about the usability of the new system: remove a
lot of old options and add in new, simpler ways of doing what people
need.&lt;/p&gt;
&lt;p&gt;All data and storage will remain the same, so there &lt;strong&gt;is no big data
migration needed.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The old and new clusters will be accessible at the same time (two
different login nodes), with the same filesystems mounted (same data
available) and some compute resources still available there, so that
people can slowly migrate.  But the old one won’t stay running too
long, to avoid long maintenance effort or splitting of the resources.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="reproduciblity"&gt;
&lt;h2&gt;Reproducibility&lt;/h2&gt;
&lt;p&gt;The biggest problem with big cluster updates like this is
&lt;strong&gt;reproducibility&lt;/strong&gt;: does your work from a month ago still work in one
month?  If not, this is a big problem.  It’s even worse if there is a
much longer gap before you come back to it (paper revisions, anyone?).&lt;/p&gt;
&lt;p&gt;You could say there are two things that can go wrong with a cluster upgrade or change:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Specific software/code that needs to be compiled and installed:&lt;/strong&gt;
Software needs re-compiling for new clusters or new cluster OS updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Whole workflows:&lt;/strong&gt; you need to make all the pieces work together.
Different paths and workflow managers may need updating.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What you can do:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Manage any messes you have earlier rather than later.  It’s better
if you slowly clean up over time, so you can focus on the
differences once the change happens.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Know what software you are using.  It’s easier for us to re-install something we
have already installed when someone can tell us the exact name and version
that they are using.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://coderefinery.github.io/testing/"&gt;Tests for your software&lt;/a&gt;.  Some way to validate
that it works correctly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Contact &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/"&gt;Aalto RSE&lt;/a&gt; for hands-on
help supporting the transition.  Come to the &lt;a class="reference external" href="https://scicomp.aalto.fi/help/garage/"&gt;garage&lt;/a&gt; early and often.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
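&lt;p&gt;One possible way to record what software you are using before the
upgrade is sketched below.  Which commands apply depends on how your
software was installed, the output file names are arbitrary choices of
ours, and we assume the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;module&lt;/span&gt;&lt;/code&gt; command prints its list to standard
error (hence the redirection):&lt;/p&gt;
&lt;div class="highlight-console notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;$ &lt;/span&gt;module list 2&amp;gt;&amp;amp;1 | tee modules-used.txt
&lt;span class="gp"&gt;$ &lt;/span&gt;conda env export &amp;gt; environment.yml
&lt;span class="gp"&gt;$ &lt;/span&gt;pip freeze &amp;gt; requirements.txt
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Keeping these files next to your code gives you the exact names and
versions to tell us when something needs to be re-installed.&lt;/p&gt;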
&lt;/section&gt;
&lt;section id="feedback-and-future-usability"&gt;
&lt;h2&gt;Feedback and future usability&lt;/h2&gt;
&lt;p&gt;If there are any annoyances about Triton that you’d like us to
consider for the upgrade, now is the time to let us know so we can
plan them.  &lt;strong&gt;We especially value feedback on usability problems.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Discuss with us in &lt;a class="reference external" href="https://scicomp.zulip.cs.aalto.fi/#narrow/stream/6-triton/topic/feedback.on.new.Triton"&gt;our chat&lt;/a&gt;,
or &lt;a class="reference external" href="https://version.aalto.fi/gitlab/AaltoScienceIT/triton/issues/"&gt;open a Triton issue&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This post has been updated with minor corrections, changes can be found in
git history.&lt;/em&gt;&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2023/preparing-for-new-triton/"/>
    <summary>Sometime in autumn of 2023 (e.g. October/November), we will do a major
update of Triton: updating the basic operating system, and thus almost
everything else.  There are big benefits to this, such as newer basic
operating system software, but such a basic update also affects almost
every user.  For a short time, this will make a lot of work for almost
every user.  This post gives advance warning and a chance to give feedback
on how to make the update as usable as possible.This post is just advance warning and things to prepare already.  All
actual instructions will come later.We will update the basic operating system from CentOS 7 to something
else (Red Hat 9).  We’ve ordered all new management hardware
to make the backend more reliable and manageable.  Along with this
comes an update of the software build system, which should allow
us to deploy software to our users even better.  We’ll also update our
configuration management system for more reproducibility.We also hope to think about the usability of the new system: remove a
lot of old options and add in new, simpler ways of doing what people
need.All data and storage will remain the same, so there is no big data
migration needed.The old and new clusters will be accessible at the same time (two
different login nodes), with the same filesystems mounted (same data
available) and some compute resources still available there, so that
people can slowly migrate.  But the old one won’t stay running too
long, to avoid long maintenance effort or splitting of the resources.The biggest problem with big cluster updates like this is
reproducibility: does your work from a month ago still work in one
month?  If not, this is a big problem.  It’s even worse if there is a
much longer gap before you come back to it (paper revisions, anyone?).You could say there are two things that can go wrong with a cluster upgrade or change:Specific software/code that needs to be compiled and installed:
Software needs re-compiling for new clusters or new cluster OS updates.Whole workflows: you need to make all the pieces work together.
Different paths and workflow managers may need updating.What you can do:Manage any messes you have earlier rather than later.  It’s better
if you slowly clean up over time, so you can focus on the
differences once the change happens.Know what software you are using.  It’s easier for us to re-install something we
have already installed when someone can tell us the exact name and version
that they are using.Tests for your software.  Some way to validate
that it works correctly.Contact Aalto RSE for hands-on
help supporting the transition.  Come to the garage early and often.If there are any annoyances about Triton that you’d like us to
consider for the upgrade, now is the time to let us know so we can
plan them.  We especially value feedback on usability problems.Discuss with us in our chat,
or open a Triton issue.This post has been updated with minor corrections, changes can be found in
git history.</summary>
    <published>2023-09-12T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2023/aalto-rse-hiring-process/</id>
    <title>The Aalto RSE hiring process</title>
    <updated>2023-08-21T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="the-aalto-rse-hiring-process"&gt;

&lt;p&gt;This post describes the hiring process of Aalto RSE.  The goal is to
make hiring more equitable by providing the background information so
that everyone can apply successfully.  For those not applying to us,
it might still provide some valuable insight about how to market your
skills as a PhD making a sideways career move.  What’s said here may
not apply to every organization, but it might give you some things to
think about.&lt;/p&gt;
&lt;p&gt;Disclaimer: This page is a rough average description of the past, not
a promise to always do this in the future.&lt;/p&gt;
&lt;section id="background"&gt;
&lt;h2&gt;Background&lt;/h2&gt;
&lt;p&gt;&lt;a class="reference external" href="https://scicomp.aalto.fi/rse/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;Aalto RSE&lt;/span&gt;&lt;/a&gt; has usually hired people who have postdoc
experience and will transition to a more applied
software/data/computing oriented role (as opposed to being focused on
writing papers).  For many people, applying to us is their first experience of job
applications post-degree, and thus they have to learn how to present
their skills in a new, non-academic context.&lt;/p&gt;
&lt;p&gt;One should start by reading about us - we have lots of information
publicly available about what we do and how we think.  This should be
understood in order to do the next steps well.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="the-cover-letter"&gt;
&lt;h2&gt;The cover letter&lt;/h2&gt;
&lt;p&gt;The cover letter is the most important thing we read, and the first
and most important filter.  It’s read before the CV.&lt;/p&gt;
&lt;p&gt;At the level we are at, almost everyone’s CV and achievements are
effectively equivalent.  Does it matter who got the most fancy papers?
Who has the most awards?  The classes people took?  When most of a
person’s knowledge has come from self-study, probably not.  The cover
letter is the chance to interpret your skills in the context of the
job you are applying for.&lt;/p&gt;
&lt;p&gt;When reading the cover letter, the first question we ask is “does this
person know what they are applying to and know why they think they are
a good fit?”  (It’s always interesting to get letters which clearly
don’t understand the job, but on the other hand it’s an easy filter.)
The first paragraph should answer this question, and the rest of
the letter should go into detail about why.  Start with the most
important information; don’t make it hard for us.&lt;/p&gt;
&lt;p&gt;Beyond that, talk about interests and skills as relevant to the
organization.  Discuss special projects, including non-academic ones
or random things that you are interested in (this is especially true
for us, since we are the transition from academia to practical work).
Our job advertisement gives you some specific ideas that you can talk
about.  Anything specifically important to the job should be pointed
out here and not just left in the CV.&lt;/p&gt;
&lt;p&gt;If you don’t exactly fit the stated job requirements: here is the
chance to explain it.  The job requirement has to say roughly what we
need (to not waste people’s time when applying, and because our hiring
decisions must be justifiable based on the requirements), but there
are many cases where someone with a different experience can
accomplish our actual goal (as said in the job ad or found in your
background research).  A person who can say this, and show that they are
adaptable, will have a very good chance.&lt;/p&gt;
&lt;p&gt;We have adopted a system of anonymous recruiting.  We request that
cover letters are submitted without identifying information (name,
signature, etc.) so that one person gives them numbers, and a broader
group tries to take a non-biased look at them.  After this initial
impression, we bring in the rest of the application.  Don’t make
assumptions about what the reader will know about your background,
just say it.&lt;/p&gt;
&lt;p&gt;The letter should be as short as possible to get the information
across.  One page is usually about the shortest we get, and a bit less
than two pages is typical.  But if it’s engaging, we’ll read as much
as you write.  Remember, most important information first, don’t make
us hunt for things.&lt;/p&gt;
&lt;p&gt;Update 2024: Do you want to use AI to write your cover letter?  Please
think again.  Since LLMs became a thing, cover letters have become
harder to read, longer, and more generic-sounding.  It’s better to
write in your own voice and be shorter than to rely on what AI gives
you.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="the-rest-of-the-job-application"&gt;
&lt;h2&gt;The rest of the job application&lt;/h2&gt;
&lt;p&gt;The CV serves as non-anonymous reference information, but CVs are
hard to read and all look pretty similar.  To be honest, we don’t
worry that much about the format and contents here: get us basic
factual information in the most efficient way.  For our particular
jobs, non-academic skills such as software/data tools are more
important than scientific articles, etc.  Remember, we are busy
and have plenty of applications, so make it easy to read.&lt;/p&gt;
&lt;p&gt;Open Science isn’t just good for research, it’s good for you, too.  If
you can point to public repositories of work you have done, this is
very useful.  Things like GitLab/GitHub profiles with activity and
your own projects, links to data you have released, etc.  They don’t
have to be perfect - something is better than nothing.  The best case
would be a few projects which are well-done (and you know it and point
them out to us), and plenty more stuff that may be of lower quality to
show you can get simple stuff done simply.  Not everyone is fortunate
to have a field where they can practice open science throughout their
career, but even publishing a project or two before they apply for a
job with us is very useful.&lt;/p&gt;
&lt;p&gt;Despite what the previous section said, we do try to dig through
applications that seem on-topic but don’t say everything we are looking
for, to give them the fairest shot we can.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="the-filtering-process"&gt;
&lt;h2&gt;The filtering process&lt;/h2&gt;
&lt;p&gt;We always need to heavily filter the list down.  Some relevant
filtering includes:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Do they know what job they are applying for?  Can they connect their
skills to the job?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Have they touched on the main points in our job advertisement and
the linked “&lt;a class="reference external" href="https://scicomp.aalto.fi/rse/become-a-rse/" title="(in Aalto Scientific Computing)"&gt;&lt;span class="xref std std-doc"&gt;Become a RSE&lt;/span&gt;&lt;/a&gt;” page?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Are they interested in teaching, mentoring, and real collaborative
projects?  Do they know what kind of teaching and mentoring we do?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Is there enough knowledge about the research process?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Any relevant skills about this call’s particular topic (if there is
any)?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How do their skills and experience match what our team is currently
missing, regardless of the open call?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How similar has their previous work been to “research engineering”
(helping the research process) instead of only focusing on academic
promotion?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The recruitment team makes several passes over the applications and discusses how to
filter them down.  We try to get a good variety of candidates.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="interviews"&gt;
&lt;h2&gt;Interviews&lt;/h2&gt;
&lt;p&gt;Sometimes there are initial recorded “video interviews”, which
provide some initial familiarity in both directions before the actual
interviews.  We know these are non-interactive and a recording isn’t a
conversation, so this is harder than an interview, but we take that into account
when watching them.  One shouldn’t worry too much about these, if we
do them.&lt;/p&gt;
&lt;p&gt;Our actual interviews are not designed to be stressful.  We have some
prepared questions and go through them in a friendly manner.  You have
a chance to ask us questions at the beginning and end (and any
other time too).  The questions are designed to hear about your
experiences and not trick or test you.&lt;/p&gt;
&lt;p&gt;We don’t currently ask technical challenge questions.  The number of
things which you’d need to know is so broad, it’s more important that
you can learn things quickly.  Since we usually interview relatively
advanced people, we can instead look at existing projects they have
done and check references, without having to do a technical
challenge.  This may change depending on the type of candidates we are
interviewing, but just like in the main interviews, we are more interested
in how people think than in raw knowledge.&lt;/p&gt;
&lt;p&gt;In the future, there might be more “meet the team” kind of events.&lt;/p&gt;
&lt;p&gt;We want to respond to people as soon as possible, but there’s a simple
fact: we don’t want to tell anyone “no” until we are very sure we have
an acceptance (we don’t want to tell someone “no” and then hire them
later), and we have very many qualified candidates.  So there is often
an unfortunately long delay in hearing back.  We hope that everyone
knows within a month, though (and ideally ~2 weeks if all goes well).&lt;/p&gt;
&lt;/section&gt;
&lt;section id="if-you-don-t-make-it"&gt;
&lt;h2&gt;If you don’t make it&lt;/h2&gt;
&lt;p&gt;We get a relatively large number of applications, with a lot of good
people.  So far (before 2023), we have been hiring at a relatively
high level - researchers with postdoc experience who have had some
sort of RSE-like experience with helping others with research (beyond
only focusing on making papers for themselves) and technology.
Don’t let this discourage you.  There are many qualified applications,
so if you don’t get selected, that doesn’t mean that you were
unqualified.  We look at everyone, regardless of their level, for
every position.  The fit to our particular job is more important than
anything else, so keep trying until you get the right fit - it’s just
a numbers game.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="old-job-application-text"&gt;
&lt;h2&gt;Old job application text&lt;/h2&gt;
&lt;p&gt;For reference, this is an older job application text, so that you can
see how the things above are integrated.  (to be updated with the 2023
version soon)&lt;/p&gt;
&lt;div class="dropdown admonition"&gt;
&lt;p class="admonition-title"&gt;RSE job advertisement, 2022&lt;/p&gt;
&lt;p&gt;[ standard header removed ]&lt;/p&gt;
&lt;p&gt;Aalto Scientific Computing is looking for a&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Research Software Engineer/Supporter&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To a permanent, full-time position.&lt;/p&gt;
&lt;p&gt;Are you more of a programmer than your researcher colleagues? Are you
more of a researcher than commercial developers? Do you fit in both, but
have a home in neither? Be a Research Software Engineer with us and find
your home. If you are looking for a career path which combines the
interesting parts of both fields, this is a good choice.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://scicomp.aalto.fi/about/"&gt;Aalto Scientific Computing&lt;/a&gt; is an
elite “special forces” unit of Research IT, providing high-performance
computing hardware, management, research support, teaching, and
training. Our team consists of a core of PhD staff working with top
researchers throughout the university. Our services are used by every
school at Aalto University and known throughout Finland and the Nordics.
All our work is open-source by default and we take an active part in
worldwide projects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;In this position, you will:&lt;/strong&gt;&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Provide software development and consulting as a service, depending
on demand from research groups.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Provide one-on-one research support from a software, programming,
Linux, data, and infrastructure perspective: short-term projects
helping researchers with specific tasks, so that the researchers
gain competence to work independently.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;As needed and depending on interest, teaching and other research
infrastructure support.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Continually learn new skills as part of our team.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Primary qualifications&lt;/strong&gt;: There are two main tracks, and candidates of
diverse backgrounds are encouraged to apply – every candidate will be
evaluated according to their own unique experiences.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;PhD degree with research experience in some computational field and
much knowledge of practical computing strategies for research, or&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Software developer or computational scientist with a strong
software/open source/Linux background, scientific computing
experience, and some experience in research. Masters degree or
similar experience.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;This particular call emphasizes the ability to work in machine
learning and AI environments&lt;/strong&gt;. The ideal candidate will be working
closely with machine learning researchers, and thus a background in
machine learning is highly desirable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Important skills:&lt;/strong&gt;&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Ability to tackle any problem with a researcher’s mindset and a
developer’s passion for technology.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Experience or knowledge of the principles of open source software,
open science, and software development tools such as version
control.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Please see &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/become-a-rse/"&gt;https://scicomp.aalto.fi/rse/become-a-rse/&lt;/a&gt; for more
information on what kind of skills we value - or more precisely
what you are likely to learn.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;What we offer&lt;/strong&gt;:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;You will join the dynamic Aalto Scientific Computing team, where you
will learn from some of the best research IT specialists in
Finland.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Co-working within top-quality research groups, getting experience in
a wide variety of fields and developing an extensive network of
scientific contacts. This includes contacts to the Aalto startup
scene and community.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A way to be close to the research process while focusing on
interesting computational problems and not the publication
process.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Our program will offer you a chance to improve your software skills –
you are expected to engage in plenty of professional development.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Open Source is our expectation. All (or most) of your code may be
open source and may be added to your public CV, depending on the
needs of researchers.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Salary will be according to experience, for a recently graduated PhD
similar to a postdoc salary. Work hours are flexible, but are expected
to sync with the audience being served. Primary workplace is Otaniemi,
Espoo (Helsinki region), Finland. Aalto University has a hybrid work
policy which allows 60% remote work possibility, and our team takes good
advantage of this flexibility.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;To apply successfully&lt;/strong&gt;:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Please include a separate cover letter (~1-2 pages). Please try to
write your cover letter avoiding information like name, gender,
nationality or other demographic information that is not directly
related to why you would be the right person for this position
(this includes, for example, a signature on the letter) unless you
think it benefits you. This will assist in anonymous recruitment
possibilities. The letter should include for example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Why being a Research Software Engineer is for you,&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;past research experience, if any&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;past technical teaching or mentoring experience,&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;past software development experience (even informal
self-learning),&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;past Linux, command line, or scripting experience,&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;highlight one (or a few) collaborative projects you have taken
part in and your role within it, and&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;what you bring and what you intend to learn.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A normal professional or academic CV including&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;a list of your technical and programming tools and level of
proficiency (e.g. basic/proficient/expert). This is the time to
show the breadth of your experience.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Github link or other public sample code. If not available,
whatever is possible to demonstrate past programming
experience. Please highlight one or two of your outstanding
research software projects.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;[ standard footer removed ]&lt;/p&gt;
&lt;/div&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2023/aalto-rse-hiring-process/"/>
    <summary>This post describes the hiring process of Aalto RSE.  The goal is to
make hiring more equitable by providing the background information so
that everyone can apply successfully.  For those not applying to us,
it might still provide some valuable insight about how to market your
skills as a PhD making a sideways career move.  What’s said here may
not apply to every organization, but it might give you some things to
think about.</summary>
    <published>2023-08-21T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2023/whisper-deployed-llms-coming/</id>
    <title>Whisper deployed on Triton, LLMs coming</title>
    <updated>2023-08-08T00:00:00+00:00</updated>
    <author>
      <name>Mira Salmensaari</name>
    </author>
    <content type="html">&lt;section id="whisper-deployed-on-triton-llms-coming"&gt;

&lt;section id="whisper-now-easily-available-for-researchers"&gt;
&lt;h2&gt;Whisper now easily available for researchers&lt;/h2&gt;
&lt;div class="admonition seealso"&gt;
&lt;p class="admonition-title"&gt;See also&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://scicomp.aalto.fi/triton/apps/whisper/"&gt;Whisper on Triton documentation&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a class="reference external" href="https://github.com/openai/whisper"&gt;OpenAI Whisper&lt;/a&gt; is a tool for
speech transcription.  It works well and has potential applications
in many different research and non-research use cases.  Using it isn’t
too hard - if you can install it and if you have a GPU.  Often,
installation becomes a big barrier, especially for “just testing”.&lt;/p&gt;
&lt;p&gt;Luckily, we have a &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/"&gt;cluster&lt;/a&gt; with
GPUs and a way to provide software for researchers.  We’ve made
Whisper available on the cluster as a &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/tut/modules/"&gt;module&lt;/a&gt;, so it’s trivial to
use it for any audio data you may have.  All one needs to do is:&lt;/p&gt;
&lt;div class="highlight-console notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;$ &lt;/span&gt;module&lt;span class="w"&gt; &lt;/span&gt;load&lt;span class="w"&gt; &lt;/span&gt;whisper
&lt;span class="gp gp-VirtualEnv"&gt;(help gets printed out)&lt;/span&gt;
&lt;span class="gp"&gt;$ &lt;/span&gt;srun&lt;span class="w"&gt; &lt;/span&gt;--mem&lt;span class="o"&gt;=&lt;/span&gt;6G&lt;span class="w"&gt; &lt;/span&gt;singularity_wrapper&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;YOUR_FILE.wav&lt;span class="w"&gt; &lt;/span&gt;--model_directory&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$medium&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--local_files_only&lt;span class="w"&gt; &lt;/span&gt;True&lt;span class="w"&gt; &lt;/span&gt;--language&lt;span class="w"&gt; &lt;/span&gt;en
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;It might look complicated, but all you need to do is copy and paste.
The first words request the resources, the middle specifies your file,
and the last are some standard options to make it do things like use
our pre-downloaded model files.  Yes - this still requires knowledge
of how to use a cluster in general, but once you’ve got that
knowledge, transcribing audio is trivial.  We have a &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/#tutorials"&gt;self-study
course&lt;/a&gt; on cluster
usage, and users can always drop by and &lt;a class="reference external" href="https://scicomp.aalto.fi/help/"&gt;ask us for help&lt;/a&gt;, for example at our daily
garage.&lt;/p&gt;
&lt;p&gt;See the &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/apps/whisper/"&gt;Whisper on Triton documentation&lt;/a&gt; for more
information on usage.&lt;/p&gt;
&lt;p&gt;We are also preparing a way to do this through the cluster web
interface &lt;a class="reference external" href="https://scicomp.aalto.fi/triton/usage/ood/"&gt;Open OnDemand&lt;/a&gt;, which will remove
most of the need to know how a cluster works and make the tool even
more accessible to other communities.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="llms-and-other-tools-next"&gt;
&lt;h2&gt;LLMs and other tools next&lt;/h2&gt;
&lt;p&gt;We hope to make other tools available like this.&lt;/p&gt;
&lt;p&gt;Whisper is just one of the latest tools, but you’ve probably noticed
that large language models are very popular these days.  There are, in
fact, some that can run locally on our own cluster, and our goal is to
deploy more of these so that they can be easily tested and used.  The
intention isn’t to replace existing LLM services, but
to make internal use for testing, research, and development easier.&lt;/p&gt;
&lt;p&gt;Local installs have various benefits, including lower cost (since we
already own the hardware), being able to ensure reproducibility
longer-term (since models are locally downloaded and preserved), and
being able to use them without various registrations.  The downside is that
the most popular ones aren’t available for local use.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="the-role-of-asc"&gt;
&lt;h2&gt;The role of ASC&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Contact us if you need other models deployed, or if you have trouble
using what’s already out there.  We are still in an early phase, and
there will probably be some difficulties in availability,
accessibility, and reusability.&lt;/strong&gt; &lt;a class="reference external" href="https://scicomp.aalto.fi/help/"&gt;Contact us early if you notice
anything that’s not right&lt;/a&gt;.  We
help both with installing things and with &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/"&gt;using them as a research
engineer partner&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It’s clear that artificial intelligence and machine learning tools
will become more critical tools for other research.  The difficulty in
deploying and using them could become a barrier, and that is where
Aalto Scientific Computing comes in.  It’s our goal to make sure the
infrastructure that researchers need is ready and able to be used by
everyone, not just those with classic HPC experience.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="tech-details-difficulties-and-solutions"&gt;
&lt;h2&gt;Tech details: difficulties and solutions&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Here we go over some implementation details, which may help others
who want to deploy similar things on their own clusters.  If you just
want to use things, you don’t need to read on.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;We installed Whisper in a &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Singularity_(software)"&gt;container&lt;/a&gt;, so that all
dependencies are packaged together and things are portable.  The model
definitions themselves are &lt;em&gt;not&lt;/em&gt; included in the container, but
mounted in.  We try to find options that allow one to specify the
model and model directory, so that the user can try out different
models without downloading each one.  The Lmod module file prints out
some help when loaded.&lt;/p&gt;
&lt;p&gt;We’ve got two versions installed: normal Whisper, and
Whisper-diarization (which can identify speakers in the transcript).&lt;/p&gt;
&lt;p&gt;Whisper and diarization both have multiple different
implementations. It’s a bit of guesswork to see which one is the
easiest to get running and works the best (not in terms of transcript
quality, but ease of deployment in a container and with local
models). This led to a change to another implementation of diarization
midway since the current one is more active in development and seems
overall slightly better. A lot of the work was fortunately
transferable to the new implementation.&lt;/p&gt;
&lt;p&gt;There were the common issues with getting the right dependencies in a
container and getting the GPUs to work there.  This is pretty standard
by now.&lt;/p&gt;
&lt;p&gt;Most implementations of Whisper want to download models at run
time. This might make sense for a general user, but doesn’t really make
sense on a cluster. Depending on the implementation, getting it to use
local models is not always trivial. Since GPU execution of diarization
uses several models at once, there doesn’t seem to be a simple way to
have it use local models at all without changing the code. It also
required some sleuthing to find where exactly the models are
downloaded.  If the code uses Hugging Face libraries, &lt;a class="reference external" href="https://huggingface.co/docs/huggingface_hub/main/en/package_reference/environment_variables"&gt;these environment variables&lt;/a&gt;
can be useful.&lt;/p&gt;
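&lt;p&gt;As a rough illustration of the Hugging Face point above, here is a minimal shell sketch of forcing cached-only model loading.  It is not our actual module file: &lt;code&gt;MODEL_ROOT&lt;/code&gt; and its example path are hypothetical, while &lt;code&gt;HF_HOME&lt;/code&gt; and &lt;code&gt;HF_HUB_OFFLINE&lt;/code&gt; are variables documented by &lt;code&gt;huggingface_hub&lt;/code&gt;.  Whether a given Whisper implementation honours them depends on how it loads its models.&lt;/p&gt;

```shell
# Hypothetical location of models downloaded in advance (an assumption for this sketch).
MODEL_ROOT="${MODEL_ROOT:-/path/to/shared/models}"

# HF_HOME: where Hugging Face libraries keep their cache (models live under "$HF_HOME/hub").
export HF_HOME="$MODEL_ROOT"
# HF_HUB_OFFLINE=1: make huggingface_hub fail rather than attempt a download.
export HF_HUB_OFFLINE=1

# Show what was set, so a job log records which model cache was used.
echo "HF_HOME=$HF_HOME HF_HUB_OFFLINE=$HF_HUB_OFFLINE"
```

&lt;p&gt;Setting these in the module file or job script, rather than in the code, keeps the container itself unchanged.&lt;/p&gt;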
&lt;p&gt;Making a module that is easy and practical for users without
losing options is usually a bit tricky: we want users to be able to
do anything, for “the right thing” to happen automatically, and not
to build some opaque framework to make it happen.  Singularity-wrapper
fortunately helps quite a bit by doing a lot of background work such as
binding directories, GPU flags, etc. cleanly without users having to
care about it, while still giving the option to run the container
straight through Apptainer/Singularity if finer control is necessary.&lt;/p&gt;
&lt;p&gt;Testing if the containers work is somewhat annoying. Diarization in
particular saves a lot of cache files all over the place, which all
need to be purged when testing GPU runs. Otherwise the GPU will
stay idle since everything it would do is already in cache.  This also
affects clean-up after users run the code.&lt;/p&gt;
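&lt;p&gt;The purge step described above could be sketched roughly as follows.  This is only an illustration under assumptions: the function name &lt;code&gt;purge_caches&lt;/code&gt; is hypothetical, and because the real cache locations depend on the implementation, the directories are passed in explicitly and the demo runs on a throwaway directory.&lt;/p&gt;

```shell
# Remove every directory given as an argument, skipping ones that do not exist.
purge_caches() {
    for cache_dir in "$@"; do
        if [ -d "$cache_dir" ]; then
            rm -rf -- "$cache_dir"
            echo "purged $cache_dir"
        fi
    done
}

# Demo on a throwaway directory standing in for a real cache location.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/fake_cache"
touch "$demo_root/fake_cache/result.bin"
purge_caches "$demo_root/fake_cache" "$demo_root/does_not_exist"
```

&lt;p&gt;Running such a function before each GPU test ensures the run starts cold, so the GPU actually has work to do.&lt;/p&gt;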
&lt;p&gt;A minor inconvenience for us (but possibly large for users) is that
the syntax for each Whisper CLI implementation tends to differ
slightly. This makes swapping between implementations slightly
annoying since you have to check the flag syntax every
time.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2023/whisper-deployed-llms-coming/"/>
    <summary>Whisper on Triton documentation. OpenAI Whisper is a tool for
speech transcription.  It works well and has potential applications
in many different research and non-research use cases.  Using it isn’t
too hard - if you can install it and if you have a GPU.  Often,
installation becomes a big barrier, especially for “just testing”. Luckily, we have a cluster with
GPUs and a way to provide software for researchers.  We’ve made
Whisper available on the cluster as a module, so it’s trivial to
use it for any audio data you may have.  All one needs to do is load the module and run one command. It might look complicated, but all you need to do is copy and paste.
The first words request the resources, the middle specifies your file,
and the last are some standard options to make it do things like use
our pre-downloaded model files.  Yes - this still requires knowledge
of how to use a cluster in general, but once you’ve got that
knowledge, transcribing audio is trivial.  We have a self-study
course on cluster
usage, and users can always drop by and ask us for help, for example at our daily
garage. See the Whisper on Triton documentation for more
information on usage. We are also preparing a way to do this through the cluster web
interface Open OnDemand, which will remove
most of the need to know how a cluster works and make the tool even
more accessible to other communities. We hope to make other tools available like this. Whisper is just one of the latest tools, but you’ve probably noticed
that large language models are very popular these days.  There are, in
fact, some that can run locally on our own cluster, and our goal is to
deploy more of these so that they can be easily tested and used.  The
intention isn’t to replace existing LLM services, but
to make internal use for testing, research, and development easier. Local installs have various benefits, including lower cost (since we
already own the hardware), being able to ensure reproducibility
longer-term (since models are locally downloaded and preserved), and
being able to use them without various registrations.  The downside is that
the most popular ones aren’t available for local use. Contact us if you need other models deployed, or if you have trouble
using what’s already out there.  We are still in an early phase, and
there will probably be some difficulties in availability,
accessibility, and reusability. Contact us early if you notice
anything that’s not right.  We
help both with installing things and with using them as a research
engineer partner. It’s clear that artificial intelligence and machine learning tools
will become more critical tools for other research.  The difficulty in
deploying and using them could become a barrier, and that is where
Aalto Scientific Computing comes in.  It’s our goal to make sure the
infrastructure that researchers need is ready and able to be used by
everyone, not just those with classic HPC experience. Here we go over some implementation details, which may help others
who want to deploy similar things on their own clusters.  If you just
want to use things, you don’t need to read on. We installed Whisper in a container, so that all
dependencies are packaged together and things are portable.  The model
definitions themselves are not included in the container, but
mounted in.  We try to find options that allow one to specify the
model and model directory, so that the user can try out different
models without downloading each one.  The Lmod module file prints out
some help when loaded. We’ve got two versions installed: normal Whisper, and
Whisper-diarization (which can identify speakers in the transcript). Whisper and diarization both have multiple different
implementations. It’s a bit of guesswork to see which one is the
easiest to get running and works the best (not in terms of transcript
quality, but ease of deployment in a container and with local
models). This led to a change to another implementation of diarization
midway since the current one is more active in development and seems
overall slightly better. A lot of the work was fortunately
transferable to the new implementation. There were the common issues with getting the right dependencies in a
container and getting the GPUs to work there.  This is pretty standard
by now. Most implementations of Whisper want to download models at run
time. This might make sense for a general user, but doesn’t really make
sense on a cluster. Depending on the implementation, getting it to use
local models is not always trivial. Since GPU execution of diarization
uses several models at once, there doesn’t seem to be a simple way to
have it use local models at all without changing the code. It also
required some sleuthing to find where exactly the models are
downloaded.  If the code uses Hugging Face libraries, these environment variables
can be useful. Making a module that is easy and practical for users without
losing options is usually a bit tricky: we want users to be able to
do anything, for “the right thing” to happen automatically, and not
to build some opaque framework to make it happen.  Singularity-wrapper
fortunately helps quite a bit by doing a lot of background work such as
binding directories, GPU flags, etc. cleanly without users having to
care about it, while still giving the option to run the container
straight through Apptainer/Singularity if finer control is necessary. Testing if the containers work is somewhat annoying. Diarization in
particular saves a lot of cache files all over the place, which all
need to be purged when testing GPU runs. Otherwise the GPU will
stay idle since everything it would do is already in cache.  This also
affects clean-up after users run the code. A minor inconvenience for us (but possibly large for users) is that
the syntax for each Whisper CLI implementation tends to differ
slightly. This makes swapping between implementations slightly
annoying since you have to check the flag syntax every
time.</summary>
    <published>2023-08-08T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2023/kickstart/</id>
    <title>SciComp Kickstart - 2023 plans and yearly strategy</title>
    <updated>2023-04-26T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="scicomp-kickstart-2023-plans-and-yearly-strategy"&gt;

&lt;p&gt;It’s time for our “kickstart course” - let’s talk about what it is,
why we run it, and why you might want to attend.&lt;/p&gt;
&lt;p&gt;The full name is “Introduction to scientific computing and HPC”
(high-performance computing), and it used to be called “HPC Kickstart”
and was taught without the first day, thus the short name “kickstart”
we still use.  In some years day 1 had a different name, but was still
taught together with days 2-3 as a package.&lt;/p&gt;
&lt;p&gt;Our goal isn’t just to teach some skills, but to form a community
around scientific computing - with researchers who have a common
language to work together and help each other, supported by Aalto
Scientific Computing in the background.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://scicomp.aalto.fi/training/scip/kickstart-2023/"&gt;Course page in 2023&lt;/a&gt;.&lt;/p&gt;
&lt;section id="topics-of-scicomp-kickstart"&gt;
&lt;h2&gt;Topics of SciComp Kickstart&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Day 1&lt;/strong&gt; is not about high-performance computing, but the
basic skills needed to do scientific computing: things like Linux
usage, data management, and the types of tools available for different
problems.  It is for almost anyone doing any kind of programming or scientific
computing work, regardless of background.  These kinds of
skills aren’t taught in academic degree programs.  We teach them on
day 1 because otherwise, new researchers have to learn from each other
or re-invent them.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Days 2 and 3 are about high-performance computing, more precisely
basic cluster usage&lt;/strong&gt; (with a focus on the basics).  They concentrate
on the kinds of tools our community usually uses.&lt;/p&gt;
&lt;p&gt;The topics have been refined over many years of both teaching and supporting
junior researchers.  Because of the way academic careers work (much
diversity of paths), these topics (even day 1) aren’t just for new
researchers: everyone can find something to learn or brush up on.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="yearly-schedule"&gt;
&lt;h2&gt;Yearly schedule&lt;/h2&gt;
&lt;p&gt;For the past few years, we have been trying to keep up this yearly summer
schedule.  This usually happens in the first full workweek:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Monday:&lt;/strong&gt; HR introductions, other formalities for new summer
workers - many departments seem to do something like this.  This may
happen earlier than the Monday of the kickstart week, since sometimes that
comes too late.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tuesday afternoon:&lt;/strong&gt; Kickstart course day 1, the general scientific
computing introduction.  Applicable to everyone doing scientific
computing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Wednesday-Thursday afternoons:&lt;/strong&gt; The HPC cluster usage part, which fewer
people will attend compared to Tuesday.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Friday:&lt;/strong&gt; we don’t have scheduled programs on Fridays, but
sometimes there are communities who host advanced tutorials here
about what their local users need.  In 2023, there is at least an
advanced GPU course then.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="this-year-s-scheduling-conflict"&gt;
&lt;h2&gt;This year’s scheduling conflict&lt;/h2&gt;
&lt;p&gt;We are aware that there is a scheduling conflict with the CS summer day
which is scheduled on the Tuesday of the 2023 HPC kickstart course.
We did contact every department in January/February, yet this was
still a surprise to us.  In past years, we have adjusted our schedule
to similar events, but this is not possible this year despite our best
efforts.&lt;/p&gt;
&lt;p&gt;We will still try to support researchers as much as possible.
&lt;a class="reference external" href="https://www.youtube.com/&amp;#64;aaltoscientificcomputing3454"&gt;Recordings of previous years are available on youtube&lt;/a&gt;, and we
also release videos the same evening as the course precisely to
support everyone regardless of these conflicts.
You can still join us for days 2 and 3 even if you did not join
day 1. However, please pay particular attention to the instructions about
setting up the Triton connection in advance.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="future"&gt;
&lt;h2&gt;Future&lt;/h2&gt;
&lt;p&gt;We hope that this blog post can explain our goals to a larger audience,
so that we can reach even more people in the future and
onboard young researchers more
systematically.  You can reach us at &lt;a class="reference external" href="mailto:scip&amp;#37;&amp;#52;&amp;#48;aalto&amp;#46;fi"&gt;scip&lt;span&gt;&amp;#64;&lt;/span&gt;aalto&lt;span&gt;&amp;#46;&lt;/span&gt;fi&lt;/a&gt;, and
each spring we reach out to the main departments to schedule each
summer’s course.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2023/kickstart/"/>
    <summary>It’s time for our “kickstart course” - let’s talk about what it is,
why we run it, and why you might want to attend. The full name is “Introduction to scientific computing and HPC”
(high-performance computing), and it used to be called “HPC Kickstart”
and was taught without the first day, thus the short name “kickstart”
we still use.  In some years day 1 had a different name, but was still
taught together with days 2-3 as a package. Our goal isn’t just to teach some skills, but to form a community
around scientific computing - with researchers who have a common
language to work together and help each other, supported by Aalto
Scientific Computing in the background. Course page in 2023. Day 1 is not about high-performance computing, but the
basic skills needed to do scientific computing: things like Linux
usage, data management, and the types of tools available for different
problems.  It is for almost anyone doing any kind of programming or scientific
computing work, regardless of background.  These kinds of
skills aren’t taught in academic degree programs.  We teach them on
day 1 because otherwise, new researchers have to learn from each other
or re-invent them. Days 2 and 3 are about high-performance computing, more precisely
basic cluster usage (with a focus on the basics).  They concentrate
on the kinds of tools our community usually uses. The topics have been refined over many years of both teaching and supporting
junior researchers.  Because of the way academic careers work (much
diversity of paths), these topics (even day 1) aren’t just for new
researchers: everyone can find something to learn or brush up on. For the past few years, we have been trying to keep up this yearly summer
schedule.  This usually happens in the first full workweek: Monday: HR introductions, other formalities for new summer
workers - many departments seem to do something like this.  This may
happen earlier than the Monday of the kickstart week, since sometimes that
comes too late. Tuesday afternoon: Kickstart course day 1, the general scientific
computing introduction.  Applicable to everyone doing scientific
computing. Wednesday-Thursday afternoons: The HPC cluster usage part, which fewer
people will attend compared to Tuesday. Friday: we don’t have scheduled programs on Fridays, but
sometimes there are communities who host advanced tutorials here
about what their local users need.  In 2023, there is at least an
advanced GPU course then. We are aware that there is a scheduling conflict with the CS summer day
which is scheduled on the Tuesday of the 2023 HPC kickstart course.
We did contact every department in January/February, yet this was
still a surprise to us.  In past years, we have adjusted our schedule
to similar events, but this is not possible this year despite our best
efforts. We will still try to support researchers as much as possible.
Recordings of previous years are available on YouTube, and we
also release videos the same evening as the course precisely to
support everyone regardless of these conflicts.
You can still join us for days 2 and 3 even if you did not join
day 1. However, please pay particular attention to the instructions about
setting up the Triton connection in advance. We hope that this blog post can explain our goals to a larger audience,
so that we can reach even more people in the future and
onboard young researchers more
systematically.  You can reach us at scip@aalto.fi, and
each spring we reach out to the main departments to schedule each
summer’s course.</summary>
    <published>2023-04-26T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2023/march-development-day/</id>
    <title>ASC development day, 2023 March</title>
    <updated>2023-03-07T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="asc-development-day-2023-march"&gt;

&lt;p&gt;We recently had an internal “development day”, which is our new name
for getting together to talk about longer-term plans.  This is our
second “development day”.  Overall, it went well, and we think that we
are on a good path.  There are three particular focus areas
for the future:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Teaching:&lt;/strong&gt; This was also a focus last time, and probably will
still be in the future.  We are overall happy with our decision
last time to focus less on many small/medium courses, and instead
focus on large, collaborative courses and then focused,
individualized support for advanced use cases.  Smaller courses
happen mainly when we see specific needs that can’t be filled other
ways (or we make them large, open, collaborative courses if there
is a broad need).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Triton v3:&lt;/strong&gt; The software/OS/management side of our cluster will
be almost completely reworked in the next year (we aren’t getting
rid of any hardware just for this).  This will take a fair amount
of our time, but is needed because existing systems are starting to
show their age.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;LUMI usage:&lt;/strong&gt; LUMI is a flagship project of EuroHPC and provides
huge resources available to the same people that can use Triton.
Triton is still needed for ease of use in everyday projects, but we
should actively look for people who can benefit from LUMI and help
them port their work there.  Our recent evaluations lead to the conclusion
that our porting help is still needed there.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;section id="teaching"&gt;
&lt;h2&gt;Teaching&lt;/h2&gt;
&lt;p&gt;Teaching has long been one of the pillars of ASC’s support.  It’s
still needed, but the focus seems to be changing.  No longer is a room
with 10-20 (or even 50) people considered a lot.  People seem both
more able and willing to find advanced material themselves, and more
in need of basic principles (git, Python for SciComp, etc).  Perhaps
this is also partly caused by the remote work period emphasizing how
all this material is available online anyway.  Our basic philosophy:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Focus on large courses for new researchers&lt;/strong&gt;, for example using
the &lt;a class="reference external" href="https://coderefinery.github.io/manuals/coderefinery-mooc/"&gt;CodeRefinery MOOC strategy&lt;/a&gt;.
This reaches the most people, helps the beginners the most,
produces high-quality open source material for asynchronous
reference, and has good possibilities for co-teaching.
Examples include &lt;a class="reference external" href="https://coderefinery.org"&gt;CodeRefinery&lt;/a&gt;, our &lt;a class="reference external" href="https://scicomp.aalto.fi/training/scip/kickstart-2022-summer/"&gt;SciComp/HPC kickstart course&lt;/a&gt;,
and &lt;a class="reference external" href="https://aaltoscicomp.github.io/python-for-scicomp/"&gt;Python for Scientific Computing&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Advanced, one-on-one, or small-group support&lt;/strong&gt; via &lt;a class="reference external" href="https://scicomp.aalto.fi/help/garage/"&gt;SciComp garage&lt;/a&gt; and the &lt;a class="reference external" href="https://scicomp.aalto.fi/rse/"&gt;Research
Software Engineering service&lt;/a&gt;.
This isn’t just for projects, but is also a useful service for
people learning from other advanced material in their work -
basically, we work as mentors.  One-on-one support is both more
rewarding for us and probably more useful to the user (relative to
time demands on both ends).  In any case, advanced courses often aren’t
offered right when people need them, so we end up in this position
regardless.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What about small/medium-sized courses, and advanced courses?&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The first two points above squeeze out medium-sized courses for
the most part, in our opinion.  By the time our audience is at an
intermediate or advanced level, they seem to be able to figure
things out themselves and ask for help when needed - if they can
figure out what they need to do.  This point deserves further
study, though.  Instead, we point to other existing material.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We will make sure that we have good recommendations for advanced
self-study courses and generally chart out the resources so that our
users don’t have to.  This is mostly done by our &lt;a class="reference external" href="https://hands-on.coderefinery.org"&gt;Hands-on Scientific
Computing&lt;/a&gt; course.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the past, we have supported community members to give courses on
topics in which they are experts.  We will continue this as appropriate (see
the next point).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We will continue to offer &lt;strong&gt;on-demand courses&lt;/strong&gt; taught by us if
someone requests them, and other smaller courses if we see a strong
need.  Contact us!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="triton-v3"&gt;
&lt;h2&gt;Triton v3&lt;/h2&gt;
&lt;p&gt;Triton is our HPC cluster, and is notable for being a Ship of Theseus:
it’s continually upgraded while being the same cluster.  This has
resulted in the software running it getting a bit out of date.  This
software was originally developed as part of broader partnerships, and as
these partnerships have changed, we need to take more responsibility
for it ourselves.&lt;/p&gt;
&lt;p&gt;Users shouldn’t see any major change from this, though part of it is
improving our (user) software installation tools, which should make
us more responsive to software installation requests.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="lumi"&gt;
&lt;h2&gt;LUMI&lt;/h2&gt;
&lt;p&gt;As said above, LUMI is a significant resource, yet our users have not
come to us asking for our help in using it. Over the past six months, we
have found some Triton users who would benefit from it and helped
extend their workflows to work on LUMI.  We do this by first testing
some applications ourselves, then looking at Triton usage for large
users and reaching out directly.&lt;/p&gt;
&lt;p&gt;Currently our focus is on GPU-intensive applications, which is made
more interesting because LUMI has AMD GPUs.  We’ve gotten local AMD
GPUs for our own testing and in general are well prepared to support
this.&lt;/p&gt;
&lt;p&gt;While LUMI is an HPC system and has a typical HPC system interface, it
serves so many different users that the software stack is very
limited, so that most users need to install their own software and
figure out how to run it on AMD GPUs.  This is why we recommend most users
access LUMI through us (we’re paid to save you time, after all), though
of course anyone interested can use it directly.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2023/march-development-day/"/>
    <summary>We recently had an internal “development day”, which is a our new name
for getting together to talk about longer term plans.  This is our
second “development day”.  Overall, it went well, and we think that we
are on an overall  good path.  There are three particular focus areas
for the future:Teaching: This was also a focus last time, and probably will
still be in the future.  We are overall happy with our decision
last time to focus less on many small/medium courses, and instead
focus on large, collaborative courses and then focused,
individualized support for advanced use cases.  Smaller courses
happen mainly when we see specific needs that can’t be filled other
ways (or we make them large, open, collaborative courses if there
is a broad need).Triton v3: The software/OS/management side of our cluster will
be almost completely reworked in the next year (we aren’t getting
rid of any hardware just for this).  This will take a fair amount
of our time, but is needed because existing systems are starting to
show their age. LUMI usage: LUMI is a flagship project of EuroHPC and provides
huge resources available to the same people that can use Triton.
Triton is still needed for ease of use of everyday projects, but we
should actively look for people who can benefit from LUMI and help
them port their work there.  Our recent evaluations lead to the conclusion
that our porting help is still needed there. Teaching has long been one of the pillars of ASC’s support.  It’s
still needed, but the focus seems to be changing.  No longer is a room
with 10-20 (or even 50) people considered a lot.  People seem both
more able and willing to find advanced material themselves, and more
in need of basic principles (git, Python for SciComp, etc).  Perhaps
this is also partly caused by the remote work period emphasizing how
all this material is available online anyway.  Our basic philosophy: Focus on large courses for new researchers, for example using
the CodeRefinery MOOC strategy.
This reaches the most people, helps the beginners the most,
produces high-quality open source material for asynchronous
reference, and has good possibilities for co-teaching.
Examples include CodeRefinery, our SciComp/HPC kickstart course,
and Python for Scientific Computing. Advanced, one-on-one, or small-group support via SciComp garage and the Research
Software Engineering service.
This isn’t just for projects, but is also a useful service for
people learning from other advanced material in their work -
basically, we work as mentors.  One-on-one support is both more
rewarding for us and probably more useful to the user (relative to
time demands on both ends).  Anyway, advanced courses often aren’t
offered right when people need them, so we are left in this position
anyway. What about small/medium-sized courses, and advanced courses? The first two points above squeeze out medium-sized courses for
the most part, in our opinion.  By the time our audience is at an
intermediate or advanced level, they seem to be able to figure
things out themselves and ask for help when needed - if they can
figure out what they need to do.  This point deserves further
study, though.  Instead, we point to other existing material. We will make sure that we have good recommendations for advanced
self-study courses and generally chart out the resources so that our
users don’t have to.  This is mostly done by our Hands-on Scientific
Computing course. In the past, we have supported community members to give courses on
topics in which they are experts.  We will continue this as appropriate (see
the next point). We will continue to offer on-demand courses taught by us if
someone requests them, and other smaller courses if we see a strong
need.  Contact us! Triton is our HPC cluster, and is notable for being a Ship of Theseus:
it’s continually upgraded while being the same cluster.  This has
resulted in the software running it getting a bit out of date.  This
software was originally developed as part of broader partnerships, and as
these partnerships have changed, we need to take more responsibility
for it ourselves. Users shouldn’t see any major change from this, though part of it is
improving our (user) software installation tools, which should make
us more responsive to software installation requests. As noted above, LUMI is a significant resource, yet our users have not
come to us asking for help in using it. Over the past six months, we
have found some Triton users who would benefit from it and helped
extend their workflows to work on LUMI.  We do this by first testing
some applications ourselves, then looking at Triton usage for large
users and reaching out directly. Currently our focus is on GPU-intensive applications, which is made
more interesting because LUMI has AMD GPUs.  We’ve gotten local AMD
GPUs for our own testing and in general are well prepared to support
this. While LUMI is an HPC system and has a typical HPC system interface, it
serves so many different users that the software stack is very
limited, so that most users need to install their own software and
figure out how to run it on AMD GPUs.  This is why we recommend most users
access LUMI through us (we’re paid to save you time, after all), though
of course anyone interested can use it directly.</summary>
    <published>2023-03-07T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2023/stickers/</id>
    <title>Aalto SciComp stickers and patches</title>
    <updated>2023-02-20T00:00:00+00:00</updated>
    <author>
      <name>Richard Darst</name>
    </author>
    <content type="html">&lt;section id="aalto-scicomp-stickers-and-patches"&gt;

&lt;p&gt;We have stickers (and patches!) to support Aalto Scientific Computing.
(You can get them from our IT offices in CS, NBE, and Physics.) But why
invest in this?  Well, it’s fun, but there should be a deeper reason.&lt;/p&gt;
&lt;figure class="align-default" id="id1"&gt;
&lt;img alt="Picture with hexagonal stickers and patches laid out on a table.  We have far more than you see here." src="https://aaltoscicomp.github.io/blog/_images/stickers-and-patches.jpg" /&gt;
&lt;figcaption&gt;
&lt;p&gt;&lt;span class="caption-text"&gt;Stickers and patches; pick them up from the Physics, Neuroscience
and Biomedical Engineering, or Computer Science departments.&lt;/span&gt;&lt;/p&gt;
&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;While our main goal is to maintain the Aalto University Triton HPC cluster
and provide courses and direct support to researchers, we cannot scale to
solve all problems and make the best decisions without a community: you!
Thus, our new promotional material is designed so that the members of our
community can show their support for scientific computing at Aalto
University.  We hope that by providing a way for the community to show
this interest, people can find - and support - each other better.&lt;/p&gt;
&lt;p&gt;We have the typical hexagonal stickers, which you can use on all the typical
sticker things.&lt;/p&gt;
&lt;p&gt;We also have patches, for those who are interested - in Finland they
are a big thing on &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Student_boilersuit"&gt;student
overalls&lt;/a&gt;, but you
could also sew them on your backpack or purse. Please send us pictures to
inspire us all! (Some have Velcro backing, if you prefer that kind
of attachment; ask us for that style.)&lt;/p&gt;
&lt;section id="black-background-vs-white-background"&gt;
&lt;h2&gt;Black background vs white background?&lt;/h2&gt;
&lt;p&gt;You may notice that for the patches some have a black background and
some have a white background.  &lt;strong&gt;Black-background means “Ask me
anything about the tools of scientific computing, I am happy to
help or at least point you in the right direction (as much as I can)!”&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here’s our idea:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Anyone may take the white background ones&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Black background is for:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Aalto Scientific Computing team staff&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Volunteers at our events (for example helpers at our workshops)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Anyone who is interested in using their time to help others in
scientific computing (regardless of their skills)&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;(clever people will notice that the first two are included in the
third, and actually anyone can be the third if they want).&lt;/p&gt;
&lt;p&gt;The idea is that we, and our community, can’t work alone.  Everyone
needs to support each other in order to work at the level we want.
The in-group experts are an undervalued resource in this, often not
getting the credit or recognition they deserve in supporting
everyone.  This is our small method of recognizing those supporters,
and we hope that in the future we support them ever more - both
career-wise and supporting them in supporting others.&lt;/p&gt;
&lt;p&gt;Yes, we should have gotten black-background stickers.  We’ll do that
next time…&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2023/stickers/"/>
    <summary>We have stickers (and patches!) to support Aalto Scientific Computing.
(You can get them from our IT offices in CS, NBE, and Physics.) But why
invest in this?  Well, it’s fun, but there should be a deeper reason. While our main goal is to maintain the Aalto University Triton HPC cluster
and provide courses and direct support to researchers, we cannot scale to
solve all problems and make the best decisions without a community: you!
Thus, our new promotional material is designed so that the members of our
community can show their support for scientific computing at Aalto
University.  We hope that by providing a way for the community to show
this interest, people can find - and support - each other better. We have the typical hexagonal stickers, which you can use on all the typical
sticker things. We also have patches, for those who are interested - in Finland they
are a big thing on student
overalls (https://en.wikipedia.org/wiki/Student_boilersuit), but you
could also sew them on your backpack or purse. Please send us pictures to
inspire us all! (Some have Velcro backing, if you prefer that kind
of attachment; ask us for that style.) You may notice that for the patches some have a black background and
some have a white background.  Black-background means “Ask me
anything about the tools of scientific computing, I am happy to
help or at least point you in the right direction (as much as I can)!” Here’s our idea: Anyone may take the white background ones. Black background is for: Aalto Scientific Computing team staff; volunteers at our events (for example helpers at our workshops); anyone who is interested in using their time to help others in
scientific computing (regardless of their skills). (Clever people will notice that the first two are included in the
third, and actually anyone can be the third if they want.) The idea is that we, and our community, can’t work alone.  Everyone
needs to support each other in order to work at the level we want.
The in-group experts are an undervalued resource in this, often not
getting the credit or recognition they deserve in supporting
everyone.  This is our small method of recognizing those supporters,
and we hope that in the future we support them ever more - both
career-wise and supporting them in supporting others. Yes, we should have gotten black-background stickers.  We’ll do that
next time…</summary>
    <published>2023-02-20T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://aaltoscicomp.github.io/blog/2021/04/what-code-has-to-teach-us-1/</id>
    <title>What code has to teach us #1: the impact of implicit behavior</title>
    <updated>2021-04-14T00:00:00+00:00</updated>
    <author>
      <name>Marijn van Vliet</name>
    </author>
    <content type="html">&lt;section id="what-code-has-to-teach-us-1-the-impact-of-implicit-behavior"&gt;

&lt;blockquote&gt;
&lt;div&gt;&lt;div class="line-block"&gt;
&lt;div class="line"&gt;“The master has failed more times than the beginner has even tried”&lt;/div&gt;
&lt;div class="line"&gt;– Stephen McCranie&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/blockquote&gt;
&lt;p&gt;As Research Software Engineers (RSEs), we read and write a lot of code.
In this series of blog posts, we are going to share some snippets that taught us important lessons, and thereby impart that wisdom unto you.
These snippets are taken from actual research code, responsible for producing results that end up in peer-reviewed scientific articles.
That is to say, results whose correctness we should have some confidence in.
However, problems have a way of cropping up in the most unexpected places and when they do, there is a chance to learn from them.&lt;/p&gt;
&lt;section id="the-impact-of-implicit-behavior"&gt;
&lt;h2&gt;The impact of implicit behavior&lt;/h2&gt;
&lt;p&gt;I was in the metro zooming through Lauttasaari when I received an email from my professor that made my heart skip a beat.
We had just submitted a paper to Nature Communications and were all still a little giddy about finally sending off the project we had been working on for 3 years.
She and the first author had been chatting about the cool methods we had been using for the project and a question arose: were we 100% certain that we “removed copies of the selected stimuli from the train set”?
If we hadn’t, we would have to quickly pull back our submission, but surely we had, right?
I thought we did.
At least, I distinctly remember writing the code to do it.
Just to be on the safe side, I decided to double check the code.&lt;/p&gt;
&lt;p&gt;Below is the analysis script in question.
It reads some data, performs some preprocessing, feeds it into a machine learning algorithm called &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;zero_shot_decoding&lt;/span&gt;&lt;/code&gt;, and stores the output.
I present it here to you in full, because there are many subtleties working together that make this situation so scary.
The question I pose to you, dear reader, is this: were the highlighted lines (118–120) executed, or did we have to pull our submission?&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="linenos"&gt;  1&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;numpy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;np&lt;/span&gt;
&lt;span class="linenos"&gt;  2&lt;/span&gt; &lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;scipy.io&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;loadmat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;savemat&lt;/span&gt;
&lt;span class="linenos"&gt;  3&lt;/span&gt; &lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;scipy.stats&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;zscore&lt;/span&gt;
&lt;span class="linenos"&gt;  4&lt;/span&gt; &lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;zero_shot_decoding&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;zero_shot_decoding&lt;/span&gt;
&lt;span class="linenos"&gt;  5&lt;/span&gt; &lt;span class="c1"&gt;#print(&amp;#39;Code version:&amp;#39;+ subprocess.check_output([&amp;#39;git&amp;#39;, &amp;#39;rev-parse&amp;#39;, &amp;#39;HEAD&amp;#39;]))&lt;/span&gt;
&lt;span class="linenos"&gt;  6&lt;/span&gt;
&lt;span class="linenos"&gt;  7&lt;/span&gt; &lt;span class="c1"&gt;# Default location of the norm data (see also the --norms command line parameter)&lt;/span&gt;
&lt;span class="linenos"&gt;  8&lt;/span&gt; &lt;span class="n"&gt;norm_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;../data/corpusvectors_ginter_lemma.mat&amp;#39;&lt;/span&gt;
&lt;span class="linenos"&gt;  9&lt;/span&gt;
&lt;span class="linenos"&gt; 10&lt;/span&gt; &lt;span class="c1"&gt;# Handle command line arguments&lt;/span&gt;
&lt;span class="linenos"&gt; 11&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Run zero-shot learning on a single subject.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 12&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;input_file&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt; 13&lt;/span&gt;                     &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The file that contains the subject data; should be a .mat file.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 14&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-s&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;--subject-id&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metavar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Subject ID&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt; 15&lt;/span&gt;                     &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The subject-id (as string). This number is recorded in the output .mat file.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 16&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;--norms&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metavar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;filename&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;norm_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt; 17&lt;/span&gt;                     &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The file that contains the norm data. Defaults to &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s1"&gt;.&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;norm_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 18&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-o&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;--output&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metavar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;filename&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;results.mat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt; 19&lt;/span&gt;                     &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The file to write the results to; should end in .mat. Defaults to results.mat&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 20&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-v&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;--verbose&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;store_true&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt; 21&lt;/span&gt;                     &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Whether to show a progress bar&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 22&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-b&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;--break-after&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metavar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt; 23&lt;/span&gt;                     &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Break after N iterations (useful for testing)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 24&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;--n_voxels&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metavar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;N voxels&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt; 25&lt;/span&gt;                     &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Number of voxels. Used only for results file name.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 26&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;-d&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;--distance-metric&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;cosine&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt; 27&lt;/span&gt;                     &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;The distance metric to use. Any distance implemented in SciPy&amp;#39;s &amp;quot;&lt;/span&gt;
&lt;span class="linenos"&gt; 28&lt;/span&gt;                           &lt;span class="s2"&gt;&amp;quot;spatial.distance module is supported. See the docstring of &amp;quot;&lt;/span&gt;
&lt;span class="linenos"&gt; 29&lt;/span&gt;                           &lt;span class="s2"&gt;&amp;quot;scipy.spatial.distance.pdict for the exhaustive list of possitble &amp;quot;&lt;/span&gt;
&lt;span class="linenos"&gt; 30&lt;/span&gt;                           &lt;span class="s2"&gt;&amp;quot;metrics. Here are some of the more useful ones: &amp;quot;&lt;/span&gt;
&lt;span class="linenos"&gt; 31&lt;/span&gt;                           &lt;span class="s2"&gt;&amp;quot;&amp;#39;euclidean&amp;#39; - Euclidean distance &amp;quot;&lt;/span&gt;
&lt;span class="linenos"&gt; 32&lt;/span&gt;                           &lt;span class="s2"&gt;&amp;quot;&amp;#39;sqeuclidean&amp;#39; - Squared euclidean distance &amp;quot;&lt;/span&gt;
&lt;span class="linenos"&gt; 33&lt;/span&gt;                           &lt;span class="s2"&gt;&amp;quot;&amp;#39;correlation&amp;#39; - Pearson correlation &amp;quot;&lt;/span&gt;
&lt;span class="linenos"&gt; 34&lt;/span&gt;                           &lt;span class="s2"&gt;&amp;quot;&amp;#39;cosine&amp;#39; - Cosine similarity (the default)&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="linenos"&gt; 35&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="linenos"&gt; 36&lt;/span&gt;
&lt;span class="linenos"&gt; 37&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;
&lt;span class="linenos"&gt; 38&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;break_after&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="linenos"&gt; 39&lt;/span&gt;     &lt;span class="n"&gt;break_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;break_after&lt;/span&gt;
&lt;span class="linenos"&gt; 40&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="linenos"&gt; 41&lt;/span&gt;     &lt;span class="n"&gt;break_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;
&lt;span class="linenos"&gt; 42&lt;/span&gt;
&lt;span class="linenos"&gt; 43&lt;/span&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Subject:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subject_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 44&lt;/span&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Input:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 45&lt;/span&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Output:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 46&lt;/span&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Norms:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 47&lt;/span&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Distance metric:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;distance_metric&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 48&lt;/span&gt;
&lt;span class="linenos"&gt; 49&lt;/span&gt;
&lt;span class="linenos"&gt; 50&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loadmat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 51&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;brainVecsReps&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="linenos"&gt; 52&lt;/span&gt;     &lt;span class="c1"&gt;# File without stability selection enabled&lt;/span&gt;
&lt;span class="linenos"&gt; 53&lt;/span&gt;     &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Stability selection DISABLED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 54&lt;/span&gt;     &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;brainVecsReps&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;brainVecsReps&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])])&lt;/span&gt;
&lt;span class="linenos"&gt; 55&lt;/span&gt;     &lt;span class="n"&gt;n_repetitions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_stimuli&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_voxels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;
&lt;span class="linenos"&gt; 56&lt;/span&gt;     &lt;span class="n"&gt;voxel_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="linenos"&gt; 57&lt;/span&gt;
&lt;span class="linenos"&gt; 58&lt;/span&gt;     &lt;span class="c1"&gt;# Drop all voxels that contain NaN&amp;#39;s for any items&lt;/span&gt;
&lt;span class="linenos"&gt; 59&lt;/span&gt;     &lt;span class="n"&gt;non_nan_mask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isnan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 60&lt;/span&gt;     &lt;span class="n"&gt;non_nan_indices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flatnonzero&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;non_nan_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 61&lt;/span&gt;     &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="n"&gt;non_nan_mask&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="linenos"&gt; 62&lt;/span&gt;
&lt;span class="linenos"&gt; 63&lt;/span&gt;     &lt;span class="c1"&gt;# Normalize betas across items&lt;/span&gt;
&lt;span class="linenos"&gt; 64&lt;/span&gt;     &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;zscore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ddof&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 65&lt;/span&gt;
&lt;span class="linenos"&gt; 66&lt;/span&gt;     &lt;span class="c1"&gt;# Average over the repetitions&lt;/span&gt;
&lt;span class="linenos"&gt; 67&lt;/span&gt;     &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 68&lt;/span&gt;
&lt;span class="linenos"&gt; 69&lt;/span&gt;     &lt;span class="n"&gt;X_perm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;
&lt;span class="linenos"&gt; 70&lt;/span&gt;     &lt;span class="n"&gt;splits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;
&lt;span class="linenos"&gt; 71&lt;/span&gt;
&lt;span class="linenos"&gt; 72&lt;/span&gt; &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;mask_voxels&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="linenos"&gt; 73&lt;/span&gt;     &lt;span class="c1"&gt;# File without stability selection enabled&lt;/span&gt;
&lt;span class="linenos"&gt; 74&lt;/span&gt;     &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Stability selection DISABLED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 75&lt;/span&gt;     &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;mask_voxels&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="linenos"&gt; 76&lt;/span&gt;     &lt;span class="n"&gt;voxel_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;voxel_ids&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="linenos"&gt; 77&lt;/span&gt;     &lt;span class="n"&gt;n_stimuli&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_voxels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;
&lt;span class="linenos"&gt; 78&lt;/span&gt;     &lt;span class="n"&gt;X_perm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;
&lt;span class="linenos"&gt; 79&lt;/span&gt;     &lt;span class="n"&gt;splits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;
&lt;span class="linenos"&gt; 80&lt;/span&gt;
&lt;span class="linenos"&gt; 81&lt;/span&gt; &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;top_voxels_perm&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="linenos"&gt; 82&lt;/span&gt;     &lt;span class="c1"&gt;# File with stability selection enabled&lt;/span&gt;
&lt;span class="linenos"&gt; 83&lt;/span&gt;     &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Stability selection ENABLED&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 84&lt;/span&gt;     &lt;span class="n"&gt;X_perm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;top_voxels_perm&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="linenos"&gt; 85&lt;/span&gt;     &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;top_voxels_all&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="linenos"&gt; 86&lt;/span&gt;     &lt;span class="n"&gt;voxel_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;top_voxel_ids&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="linenos"&gt; 87&lt;/span&gt;     &lt;span class="n"&gt;n_stimuli&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_voxels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X_perm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;
&lt;span class="linenos"&gt; 88&lt;/span&gt;
&lt;span class="linenos"&gt; 89&lt;/span&gt;     &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isfile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;leave2out_index.npy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 90&lt;/span&gt;     &lt;span class="n"&gt;splits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;leave2out_index.npy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 91&lt;/span&gt;
&lt;span class="linenos"&gt; 92&lt;/span&gt; &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;brainVecs&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="linenos"&gt; 93&lt;/span&gt;     &lt;span class="c1"&gt;# File with single-trial data&lt;/span&gt;
&lt;span class="linenos"&gt; 94&lt;/span&gt;     &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Stability selection DISABLED, single-trial data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt; 95&lt;/span&gt;     &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;brainVecs&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="linenos"&gt; 96&lt;/span&gt;     &lt;span class="n"&gt;voxel_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;voxindex&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="linenos"&gt; 97&lt;/span&gt;     &lt;span class="n"&gt;n_stimuli&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_voxels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;
&lt;span class="linenos"&gt; 98&lt;/span&gt;     &lt;span class="n"&gt;X_perm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;
&lt;span class="linenos"&gt; 99&lt;/span&gt;
&lt;span class="linenos"&gt;100&lt;/span&gt;     &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;generate_splits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_stimuli&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="linenos"&gt;101&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;Generate train-set, test-set splits.&lt;/span&gt;
&lt;span class="linenos"&gt;102&lt;/span&gt;
&lt;span class="linenos"&gt;103&lt;/span&gt;&lt;span class="sd"&gt;         To save computation time, we don&amp;#39;t do the full 360*359/2 iterations.&lt;/span&gt;
&lt;span class="linenos"&gt;104&lt;/span&gt;&lt;span class="sd"&gt;         Instead we will do the leave-2-out scheme block-wise and use the rest&lt;/span&gt;
&lt;span class="linenos"&gt;105&lt;/span&gt;&lt;span class="sd"&gt;         of the data for training.&lt;/span&gt;
&lt;span class="linenos"&gt;106&lt;/span&gt;&lt;span class="sd"&gt;         &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="linenos"&gt;107&lt;/span&gt;         &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;n_stimuli&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;block_size&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="linenos"&gt;108&lt;/span&gt;         &lt;span class="n"&gt;n_blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n_stimuli&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;block_size&lt;/span&gt;
&lt;span class="linenos"&gt;109&lt;/span&gt;         &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_stimuli&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="linenos"&gt;110&lt;/span&gt;             &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_stimuli&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="linenos"&gt;111&lt;/span&gt;                 &lt;span class="c1"&gt;# Don&amp;#39;t make the model distinguish between duplicate stimuli&lt;/span&gt;
&lt;span class="linenos"&gt;112&lt;/span&gt;                 &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;block_size&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;block_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="linenos"&gt;113&lt;/span&gt;                     &lt;span class="k"&gt;continue&lt;/span&gt;
&lt;span class="linenos"&gt;114&lt;/span&gt;
&lt;span class="linenos"&gt;115&lt;/span&gt;                 &lt;span class="n"&gt;test_set&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="linenos"&gt;116&lt;/span&gt;                 &lt;span class="n"&gt;train_set&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setdiff1d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_stimuli&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;test_set&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt;117&lt;/span&gt;
&lt;span class="hll"&gt;&lt;span class="linenos"&gt;118&lt;/span&gt;                 &lt;span class="c1"&gt;# Remove copies of the selected stimuli from the train set&lt;/span&gt;
&lt;/span&gt;&lt;span class="hll"&gt;&lt;span class="linenos"&gt;119&lt;/span&gt;                 &lt;span class="n"&gt;train_set&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setdiff1d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_set&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;block_size&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;block_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_blocks&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;/span&gt;&lt;span class="hll"&gt;&lt;span class="linenos"&gt;120&lt;/span&gt;                 &lt;span class="n"&gt;train_set&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setdiff1d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_set&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;block_size&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;block_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_blocks&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;/span&gt;&lt;span class="linenos"&gt;121&lt;/span&gt;
&lt;span class="linenos"&gt;122&lt;/span&gt;                 &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;train_set&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_set&lt;/span&gt;
&lt;span class="linenos"&gt;123&lt;/span&gt;
&lt;span class="linenos"&gt;124&lt;/span&gt;     &lt;span class="n"&gt;splits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generate_splits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_stimuli&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt;125&lt;/span&gt;
&lt;span class="linenos"&gt;126&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="linenos"&gt;127&lt;/span&gt;     &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Could not find any suitable data in the supplied input file.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt;128&lt;/span&gt;
&lt;span class="linenos"&gt;129&lt;/span&gt; &lt;span class="c1"&gt;# Load the norm data&lt;/span&gt;
&lt;span class="linenos"&gt;130&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loadmat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt;131&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;newVectors&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="linenos"&gt;132&lt;/span&gt;
&lt;span class="linenos"&gt;133&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isfinite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;all&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
&lt;span class="linenos"&gt;134&lt;/span&gt;     &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The norm data contains NaNs or Infs.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt;135&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isfinite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;all&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
&lt;span class="linenos"&gt;136&lt;/span&gt;     &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The brain data contains NaNs or Infs.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt;137&lt;/span&gt;
&lt;span class="linenos"&gt;138&lt;/span&gt; &lt;span class="n"&gt;pairwise_accuracies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_scores&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predicted_y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patterns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;zero_shot_decoding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="linenos"&gt;139&lt;/span&gt;     &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X_perm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;break_after&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;break_after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;distance_metric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cv_splits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;splits&lt;/span&gt;
&lt;span class="linenos"&gt;140&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="linenos"&gt;141&lt;/span&gt;
&lt;span class="linenos"&gt;142&lt;/span&gt; &lt;span class="n"&gt;savemat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="linenos"&gt;143&lt;/span&gt;     &lt;span class="s1"&gt;&amp;#39;pairwise_accuracies&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pairwise_accuracies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt;144&lt;/span&gt;     &lt;span class="s1"&gt;&amp;#39;weights&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;coef_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt;145&lt;/span&gt;     &lt;span class="s1"&gt;&amp;#39;feat_scores&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;target_scores&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt;146&lt;/span&gt;     &lt;span class="s1"&gt;&amp;#39;subject&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subject_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt;147&lt;/span&gt;     &lt;span class="s1"&gt;&amp;#39;inputfile&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt;148&lt;/span&gt;     &lt;span class="s1"&gt;&amp;#39;alphas&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alpha_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt;149&lt;/span&gt;     &lt;span class="s1"&gt;&amp;#39;voxel_ids&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;voxel_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt;150&lt;/span&gt;     &lt;span class="s1"&gt;&amp;#39;predicted_y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;predicted_y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt;151&lt;/span&gt;     &lt;span class="s1"&gt;&amp;#39;patterns&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;patterns&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="linenos"&gt;152&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/section&gt;
&lt;section id="lessons-this-code-has-to-teach-us"&gt;
&lt;h2&gt;Lessons this code has to teach us&lt;/h2&gt;
&lt;p&gt;The first thing that went through my head, as it probably went through yours, was: this code is so long and complicated that answering this seemingly simple question is going to take some time.
And I won’t blame you for giving up right then and there.
Hunched over my laptop while the metro passed through Ruoholahti, I tried to trace the logic of the script.&lt;/p&gt;
&lt;p&gt;First problem: much of the behavior of the script is dictated by the command line arguments.
Luckily, their values are saved in the output file, so I could check that they were correct.&lt;/p&gt;
&lt;div class="admonition note"&gt;
&lt;p class="admonition-title"&gt;Note&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; always err on the side of caution when deciding whether it is worth storing something in the result file.&lt;/p&gt;
&lt;/div&gt;
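&lt;p&gt;As a minimal sketch of what this can look like, the snippet below bundles the results with everything needed to reconstruct how they were produced. The helper &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;build_result_record&lt;/span&gt;&lt;/code&gt; and its field names are hypothetical and not part of the original script; it assumes the settings arrive as an &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;argparse&lt;/span&gt;&lt;/code&gt; namespace.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;
```python
import argparse
import sys
from datetime import datetime, timezone


def build_result_record(results, args):
    """Bundle results with the settings that produced them (hypothetical helper)."""
    return {
        'results': results,
        # Store every command line argument, not just the ones that seem important today
        'args': dict(vars(args)),
        # The exact command that was run
        'command': ' '.join(sys.argv),
        'python_version': sys.version.split()[0],
        'created': datetime.now(timezone.utc).isoformat(),
    }
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The resulting dictionary can be written out with whatever the script already uses for its output. A few extra bytes in the result file are cheap compared to not knowing how a result was obtained.&lt;/p&gt;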
&lt;p&gt;That brings us to the big &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;if&lt;/span&gt;&lt;/code&gt;-statement.
Did the correct branch execute?
Well, that depends on what was in the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;m&lt;/span&gt;&lt;/code&gt; dictionary, which translates to what variables were defined in the MATLAB file used as input to the script.
If we had used the wrong variable name, e.g. &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;brainVecsReps&lt;/span&gt;&lt;/code&gt; instead of &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;brainVecs&lt;/span&gt;&lt;/code&gt;, when creating the input file, the wrong branch would have executed and the script would have been happily computing the wrong thing.
And we would never know.
If we had used the wrong input file, or the wrong version of the input file, the wrong branch would have executed without any indication that something was wrong.
So many opportunities for small mistakes to lead to a big error.&lt;/p&gt;
&lt;div class="admonition note"&gt;
&lt;p class="admonition-title"&gt;Note&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; have the user be explicit in what they want to do, so the script can check the user’s intent against the inputs and raise a nice big error if they screwed up.
In this case, there should really have been either a command line parameter determining which branch to execute, or even better, this should have been four separate scripts.&lt;/p&gt;
&lt;/div&gt;
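&lt;p&gt;A minimal sketch of the command line parameter variant, under some assumptions: the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;--mode&lt;/span&gt;&lt;/code&gt; flag, the mode names, and the helper &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;check_input&lt;/span&gt;&lt;/code&gt; are all hypothetical. Only the variable names are taken from the script above.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;
```python
import argparse

# Variables each analysis mode expects in the input file. The mode names are
# made up for this sketch; the variable names are the ones the script reads.
REQUIRED_VARIABLES = {
    'repetitions': ['brainVecsReps'],
    'masked': ['mask_voxels', 'voxel_ids'],
    'stability': ['top_voxels_perm', 'top_voxels_all', 'top_voxel_ids'],
    'single-trial': ['brainVecs', 'voxindex'],
}


def check_input(mode, m):
    """Raise a clear error if the input file does not match the requested mode."""
    missing = [name for name in REQUIRED_VARIABLES[mode] if name not in m]
    if missing:
        raise ValueError(
            f'Mode {mode!r} needs variables {missing} that are not in the input file. '
            f'Variables found: {sorted(m)}'
        )


def parse_args(argv=None):
    """Parse the command line; the user must state which analysis they want."""
    parser = argparse.ArgumentParser()
    parser.add_argument('input_file')
    parser.add_argument('--mode', required=True, choices=sorted(REQUIRED_VARIABLES))
    return parser.parse_args(argv)
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;With something like this in place, the script would call &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;check_input(args.mode,&lt;/span&gt; &lt;span class="pre"&gt;m)&lt;/span&gt;&lt;/code&gt; right after loading the input file and then branch on &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;args.mode&lt;/span&gt;&lt;/code&gt; rather than on whichever variable happens to be present. A mismatch between intent and input then stops the run immediately instead of silently producing the wrong result.&lt;/p&gt;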
&lt;p&gt;In the end, I searched the logfile for the line &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Stability&lt;/span&gt; &lt;span class="pre"&gt;selection&lt;/span&gt; &lt;span class="pre"&gt;DISABLED,&lt;/span&gt; &lt;span class="pre"&gt;single-trial&lt;/span&gt; &lt;span class="pre"&gt;data&lt;/span&gt;&lt;/code&gt; which, thankfully, was there, so the correct branch did execute.&lt;/p&gt;
&lt;div class="admonition note"&gt;
&lt;p class="admonition-title"&gt;Note&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; be liberal with &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;print&lt;/span&gt;&lt;/code&gt;-statements (or other logging directives) in your scripts; cherish the resulting logfiles.&lt;/p&gt;
&lt;/div&gt;
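&lt;p&gt;If the logfile should not depend on someone remembering to redirect the output, Python’s standard &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;logging&lt;/span&gt;&lt;/code&gt; module can write every message to both the console and a file. This is a minimal sketch; the helper name &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;setup_logging&lt;/span&gt;&lt;/code&gt; and the logger name are hypothetical, and the original script simply used &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;print&lt;/span&gt;&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;
```python
import logging


def setup_logging(logfile):
    """Send log messages to both the console and a logfile."""
    logger = logging.getLogger('analysis')
    logger.setLevel(logging.INFO)
    # Start from a clean slate so repeated calls do not duplicate messages
    logger.handlers.clear()
    formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
    for handler in (logging.StreamHandler(), logging.FileHandler(logfile)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Each branch of the script would then announce itself with a call such as &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;logger.info('Stability&lt;/span&gt; &lt;span class="pre"&gt;selection&lt;/span&gt; &lt;span class="pre"&gt;DISABLED,&lt;/span&gt; &lt;span class="pre"&gt;single-trial&lt;/span&gt; &lt;span class="pre"&gt;data')&lt;/span&gt;&lt;/code&gt;, and the timestamped line ends up in the logfile next to the results.&lt;/p&gt;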
&lt;p&gt;I breathed a sigh of relief as the metro pulled into the central railway station.&lt;/p&gt;
&lt;p&gt;This &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;if&lt;/span&gt;&lt;/code&gt;-statement is a work of insanity.
What was I thinking, letting a mostly random naming scheme of some variables in a MATLAB file determine what the script should do?
I got lucky that time.
But from that moment on, I would heed this lesson:&lt;/p&gt;
&lt;div class="admonition note"&gt;
&lt;p class="admonition-title"&gt;Note&lt;/p&gt;
&lt;div class="line-block"&gt;
&lt;div class="line"&gt;Explicit is better than implicit.&lt;/div&gt;
&lt;div class="line"&gt;– The Zen of Python, by Tim Peters&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://aaltoscicomp.github.io/blog/2021/04/what-code-has-to-teach-us-1/"/>
    <summary>As Research Software Engineers (RSEs), we read and write a lot of code.
In this series of blog posts, we are going to share some snippets that taught us important lessons, and thereby impart that wisdom unto you.
These snippets are taken from actual research code, responsible for producing results that end up in peer-reviewed scientific articles.
That is to say, results that we should have some confidence in to be correct.
However, problems have a way of cropping up in the most unexpected places and when they do, there is a chance to learn from them. I was in the metro zooming through Lauttasaari when I received an email from my professor that made my heart skip a beat.
We had just submitted a paper to Nature Communications and were all still a little giddy about finally sending off the project we had been working on for 3 years.
She and the first author had been chatting about the cool methods we had been using for the project and a question arose: were we 100% certain that we “removed copies of the selected stimuli from the train set”?
If we hadn’t, we would have to quickly pull back our submission, but surely we had, right?
I thought we did.
At least, I distinctly remember writing the code to do it.
Just to be on the safe side, I decided to double check the code. Below is the analysis script in question.
It reads some data, performs some preprocessing, feeds it into a machine learning algorithm called zero_shot_decoding, and stores the output.
I present it here to you in full, because there are many subtleties working together that make this situation so scary.
The question I pose to you, dear reader, is this: were the highlighted lines (118–120) executed, or did we have to pull our submission? The first thing that went through my head, as it probably went through yours, was: this code is so long and complicated that answering this seemingly simple question is going to take some time.
And I won’t blame you for giving up right then and there.
Hunched over my laptop while the metro passed through Ruoholahti, I tried to trace the logic of the script. First problem: much of the behavior of the script is dictated by the command line arguments.
Luckily, their values are saved in the output file, so I could check that they were correct. Lesson: always err on the side of caution when deciding whether it is worth storing something in the result file. That brings us to the big if-statement.
Did the correct branch execute?
Well, that depends on what was in the m dictionary, which translates to what variables were defined in the MATLAB file used as input to the script.
If we had used the wrong variable name, e.g. brainVecsReps instead of brainVecs, when creating the input file, the wrong branch would have executed and the script would have been happily computing the wrong thing.
And we would never know.
If we had used the wrong input file, or the wrong version of the input file, the wrong branch would have executed without any indication that something was wrong.
So many opportunities for small mistakes to lead to a big error. Lesson: have the user be explicit in what they want to do, so the script can check the user’s intent against the inputs and raise a nice big error if they screwed up.
In this case, there should really have been either a command line parameter determining which branch to execute, or even better, this should have been four separate scripts. In the end, I searched the logfile for the line Stability selection DISABLED, single-trial data which, thankfully, was there, so the correct branch did execute. Lesson: be liberal with print-statements (or other logging directives) in your scripts; cherish the resulting logfiles. I breathed a sigh of relief as the metro pulled into the central railway station. This if-statement is a work of insanity.
What was I thinking, letting a mostly random naming scheme of some variables in a MATLAB file determine what the script should do?
I got lucky that time.
But from that moment on, I would heed this lesson:</summary>
    <published>2021-04-14T00:00:00+00:00</published>
  </entry>
</feed>
