<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<title>What's new in the Linux kernel - DebConf 2014</title>
<meta name="generator" content="S5" />
<meta name="version" content="S5 1.1" />
<meta name="author" content="Ben Hutchings" />
<!-- configuration parameters -->
<meta name="defaultView" content="slideshow" />
<meta name="controlVis" content="hidden" />
<!-- style sheet links -->
<link rel="stylesheet" href="s5-blank/ui/default/slides.css" type="text/css" media="projection" id="slideProj" />
<link rel="stylesheet" href="s5-blank/ui/default/outline.css" type="text/css" media="screen" id="outlineStyle" />
<link rel="stylesheet" href="s5-blank/ui/default/print.css" type="text/css" media="print" id="slidePrint" />
<link rel="stylesheet" href="s5-blank/ui/default/opera.css" type="text/css" media="projection" id="operaFix" />
<style type="text/css">
.logo { position: absolute; right: 0; top: 0; height: 100% }
table { border-collapse: collapse }
th { border-bottom: 2pt solid black }
th, td { padding: 0 6pt }
.package { font-family: monospace }
var { font-family: sans }
</style>
<style type="text/css" media="print">
.slide { page-break-after: always }
</style>
<script src="s5-blank/ui/default/slides.js" type="text/javascript"></script>
<div id="controls"><!-- DO NOT EDIT --></div>
<div id="currentSlide"><!-- DO NOT EDIT --></div>
<h2>What's new in the Linux kernel</h2>
<div class="presentation">
<h1>What's new in the Linux kernel</h1>
<object data="tux-debian.svg" width="35%" align="right"></object>
<h2>and what's missing in Debian</h2>
<h3>Ben Hutchings</h3>
<h1>Ben Hutchings</h1>
<ul>
<li>
Professional software engineer by day, Debian developer by night
(or sometimes the other way round)
</li>
<li>
Regular Linux contributor in both roles since 2008
</li>
<li>
Working on various drivers and kernel code in my day job
</li>
<li>
Debian kernel team member, now doing most of the unstable
maintenance aside from ports
</li>
<li>
Maintaining Linux 3.2.<var>y</var> stable update series on
</li>
</ul>
<h1>Linux releases early and often</h1>
<ul class="incremental">
<li>
Linux is released about 5 times a year (plus stable updates
</li>
<li>
...though some features aren't ready to use when they first
</li>
<li>
Since my talk last year, Linus has made 6 releases (3.11-3.16)
</li>
<li>
Good news: we have lots of new kernel features in testing/unstable
</li>
<li>
Bad news: some of them won't really work without new userland
</li>
</ul>
<h1>Recap of last year's features (1)</h1>
<ul class="incremental">
<li>
Team device driver: userland package (libteam) was uploaded in
</li>
<li>
Transcendent memory: frontswap, zswap and Xen tmem will be
enabled in next kernel upload
</li>
<li>
New KMS drivers: should all work with current Xorg drivers
</li>
<li>
Module signing: still not enabled, but probably will be if we
</li>
</ul>
<h1>Recap of last year's features (2)</h1>
<ul class="incremental">
<li>
More support for discard: still not enabled at install time
(<a href="https://bugs.debian.org/690977">#690977</a>)
</li>
<li>
More support for containers: XFS was fixed, and user namespaces
</li>
<li>
bcache: userland package (bcache-tools) still not quite ready
(<a href="https://bugs.debian.org/708132">#708132</a>)
</li>
<li>
ARMv7 multiplatform: d-i works on <em>some</em> platforms but
I'm still not sure which. Some progress on GPU drivers, but not
</li>
</ul>
<h1>Unnamed temporary files [3.11]</h1>
<ul>
<li>
Open directory with option <tt>O_TMPFILE</tt> to create an
unnamed temporary file on that filesystem
</li>
<li>
As with <tt>tmpfile()</tt>, the file disappears on
last <tt>close()</tt>
</li>
<li>
File can be linked into the filesystem using
<tt>linkat(..., AT_EMPTY_PATH)</tt>, allowing for 'atomic'
creation of file with complete contents and metadata
</li>
<li>
Not supported on all filesystem types, so you will usually need
</li>
</ul>
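<p>The points above can be sketched from userland roughly as follows. This is
a minimal sketch, not a reference implementation: the names
<tt>publish_file</tt>, <tt>_publish_with_tmpfile</tt> and
<tt>_publish_with_named_temp</tt> are hypothetical, it assumes Linux with
<tt>/proc</tt> mounted, and it links the file through
<tt>/proc/self/fd</tt> (the unprivileged alternative described in
<tt>open(2)</tt>) rather than passing <tt>AT_EMPTY_PATH</tt> directly. The
choice of error codes treated as "not supported" is also an assumption.</p>

```python
import errno
import os
import tempfile

# Error codes treated here as "O_TMPFILE path not usable" (an assumption).
UNSUPPORTED = (errno.EOPNOTSUPP, errno.EISDIR, errno.ENOENT)


def _publish_with_tmpfile(directory, name, data):
    # Opening the *directory* with O_TMPFILE yields an unnamed regular file.
    fd = os.open(directory, os.O_TMPFILE | os.O_WRONLY, 0o644)
    with os.fdopen(fd, "wb") as stream:
        stream.write(data)
        stream.flush()
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            # Passing dst_dir_fd makes Python call linkat(); following the
            # /proc symlink gives the unnamed file its first name.
            os.link("/proc/self/fd/%d" % fd, name,
                    dst_dir_fd=dir_fd, follow_symlinks=True)
        finally:
            os.close(dir_fd)


def _publish_with_named_temp(directory, name, data):
    # Portable fallback: a named temporary file that is briefly visible.
    tmp_fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(tmp_fd, "wb") as stream:
            stream.write(data)
        os.link(tmp_path, os.path.join(directory, name))
    finally:
        os.unlink(tmp_path)


def publish_file(directory, name, data):
    """Make directory/name appear only once its contents are complete."""
    try:
        _publish_with_tmpfile(directory, name, data)
        return "O_TMPFILE"
    except OSError as exc:
        if exc.errno not in UNSUPPORTED:
            raise
    _publish_with_named_temp(directory, name, data)
    return "fallback"
```

<p>Calling <tt>publish_file("/some/dir", "report.txt", b"...")</tt> returns
which path was taken. Both paths use a hard link, so an existing target is
never overwritten.</p>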
<h1>Network busy-polling [3.11] (1)</h1>
<p>A conventional network request/response process looks like:</p>
<ol class="incremental">
<li>
Task calls <tt>send()</tt>; network stack constructs a
packet; driver adds it to hardware Tx queue
</li>
<li>
Task calls <tt>poll()</tt> or <tt>recv()</tt>, which blocks;
kernel puts it to sleep and possibly idles the CPU
</li>
<li>
Network adapter receives response and generates IRQ, waking
</li>
<li>
Driver's IRQ handler schedules polling of the hardware Rx
</li>
<li>
Kernel runs the driver's NAPI poll function, which passes
the response packet into the network stack
</li>
<li>
Network stack decodes packet headers and adds packet to
</li>
<li>
Network stack wakes up sleeping task; scheduler switches
to it and the socket call returns
</li>
</ol>
<h1>Network busy-polling [3.11] (2)</h1>
<ul class="incremental">
<li>
If driver supports busy-polling, it tags each packet with
the receiving NAPI context, and kernel tags sockets
</li>
<li>
When busy-polling is enabled, <tt>poll()</tt>
and <tt>recv()</tt> call the driver's busy poll function to
check for packets synchronously (up to some time limit)
</li>
<li>
If the response usually arrives quickly, this reduces overall
request/response latency as there are no context switches and
</li>
<li>
Time limit set by sysctl (<tt>net.core.busy_poll</tt>,
<tt>net.core.busy_read</tt>) or socket option (<tt>SOL_SOCKET,
SO_BUSY_POLL</tt>); requires tuning
</li>
</ul>
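<p>As a rough illustration of the per-socket option, the sketch below asks
for a busy-poll budget on a single UDP socket. The helper name
<tt>request_busy_poll</tt> is hypothetical, and the fallback value 46 for
<tt>SO_BUSY_POLL</tt> is an assumption that holds on x86 and ARM but not on
every architecture. Raising the value needs <tt>CAP_NET_ADMIN</tt>, so an
unprivileged run is expected to be refused.</p>

```python
import socket

# Python's socket module may not export SO_BUSY_POLL. 46 is the value used
# on x86 and ARM (assumption: some other architectures use a different one).
SO_BUSY_POLL = getattr(socket, "SO_BUSY_POLL", 46)


def request_busy_poll(sock, usecs):
    """Ask for up to `usecs` microseconds of busy-polling on this socket.

    Returns the value the kernel reports afterwards, or None if refused.
    """
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_BUSY_POLL, usecs)
        return sock.getsockopt(socket.SOL_SOCKET, SO_BUSY_POLL)
    except OSError:
        # EPERM: raising the limit needs CAP_NET_ADMIN.
        # ENOPROTOOPT: kernel built without busy-poll support.
        return None


with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
    print("busy-poll budget:", request_busy_poll(sock, 50))
```

<p>The option only changes how long <tt>recv()</tt> spins before sleeping;
whether it helps still depends on the driver supporting busy-polling.</p>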
<h1>Lustre filesystem [3.12]</h1>
<ul>
<li>
A distributed filesystem, popular for cluster computing
</li>
<li>
Developed out-of-tree since 1999, but now added to Linux staging
</li>
<li>
Was included in squeeze but dropped from wheezy as it didn't
</li>
<li>
Userland is now missing from Debian
</li>
</ul>
<h1>Btrfs offline dedupe [3.12]</h1>
<ul class="incremental">
<li>
Btrfs generally copies and frees blocks, rather than updating
</li>
<li>
This allows snapshots and file copies to copy-by-reference,
deferring the real copying until changes are made
</li>
<li>
Filesystems may still end up with multiple copies of the same
</li>
<li>
Btrfs doesn't actively merge these duplicates, but userland can
</li>
<li>
Many file dedupe tools are packaged for Debian, but not one that
works with this Btrfs feature, e.g. bedup
</li>
</ul>
<h1>nftables [3.13]</h1>
<ul class="incremental">
<li>
Linux has several firewall APIs - iptables, ip6tables, arptables
</li>
<li>
All limited to single protocol, and need a kernel module for
each match type and each action
</li>
<li>
Kernel's internal netfilter API is more flexible
</li>
<li>
nftables exposes more of this flexibility, allowing userland
to provide firewall code for a specialised VM (similar to BPF)
</li>
<li>
nftables userland tool uses this API and is already packaged
</li>
<li>
Eventually, old APIs will be removed and old userland
tools must be ported to use nftables
</li>
</ul>
<h1>User-space lockdep [3.14]</h1>
<ul>
<li>
Kernel threads and interrupts all run in same address space,
using several different synchronisation mechanisms
</li>
<li>
Easy to introduce bugs that can result in deadlock, but hard to
</li>
<li>
Kernel's 'lockdep' system dynamically tracks locking operations
and detects <em>potential</em> deadlocks
</li>
<li>
Now available as a userland library! Except we need to package
it (build from linux-tools source package)
</li>
</ul>
<h1>arm64 and ppc64el ports</h1>
<ul class="incremental">
<li>
'arm64' architecture was added in Linux 3.7, but was not yet
usable, and no real hardware was available at the time
</li>
<li>
Upstream Linux arm64 kernel, and Debian packages, should now run
on emulators and real hardware
</li>
<li>
'powerpc' architecture has been available for many years,
but didn't support kernel running little-endian
</li>
<li>
Linux 3.13 added little-endian kernel support, along with new
userland ELF ABI variant - we call it ppc64el
</li>
<li>
Both ports now being bootstrapped in unstable and are candidates
</li>
</ul>
<h1>File-private locking [3.15]</h1>
<ul class="incremental">
<li>
POSIX says that closing a file descriptor removes
the <em>process</em>'s locks on that file
</li>
<li>
What if process has multiple file descriptors for the same
file? It loses all locks obtained through any descriptor!
</li>
<li>
Multithreaded processes may require serialisation around
file open/close to ensure they open each file exactly once
</li>
<li>
Hard and symbolic links can hide that two files are really the
</li>
<li>
Linux now provides file-private locks, associated with a
specific open file and removed when last descriptor for the
</li>
</ul>
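<p>The difference from classic POSIX locks can be demonstrated inside a
single process. The sketch below rests on some assumptions: as far as I know
the interface was merged under the name "open file description" (OFD) locks
with the <tt>fcntl()</tt> command <tt>F_OFD_SETLK</tt>, whose value 37 is
used as a fallback when Python does not export it, and the
<tt>struct flock</tt> packing assumes 64-bit Linux. The helper names
<tt>try_write_lock</tt> and <tt>demo</tt> are hypothetical.</p>

```python
import errno
import fcntl
import os
import struct
import tempfile

# Python 3.9+ exports F_OFD_SETLK; 37 is assumed to be its Linux value.
F_OFD_SETLK = getattr(fcntl, "F_OFD_SETLK", 37)


def try_write_lock(fd):
    """Try a non-blocking whole-file write lock owned by fd's open file."""
    # struct flock: l_type, l_whence, l_start, l_len, l_pid (must be 0 here),
    # padded to the 64-bit structure size.
    request = struct.pack("hhqqi4x", fcntl.F_WRLCK, os.SEEK_SET, 0, 0, 0)
    try:
        fcntl.fcntl(fd, F_OFD_SETLK, request)
        return True
    except OSError as exc:
        if exc.errno in (errno.EAGAIN, errno.EACCES):
            return False
        raise


def demo():
    tmp_fd, path = tempfile.mkstemp()
    os.close(tmp_fd)
    try:
        holder = os.open(path, os.O_RDWR)
        rival = os.open(path, os.O_RDWR)        # a separate open file
        results = [try_write_lock(holder)]      # lock taken via holder
        results.append(try_write_lock(rival))   # conflicts with holder
        # Opening and closing the file again would drop a classic POSIX
        # lock held by this process; the file-private lock survives.
        os.close(os.open(path, os.O_RDWR))
        results.append(try_write_lock(rival))   # still conflicts
        os.close(holder)                        # last descriptor closed
        results.append(try_write_lock(rival))   # lock is now free
        os.close(rival)
        return results
    finally:
        os.unlink(path)


print(demo())
```

<p>On a kernel with this feature the four attempts should come out as taken,
refused, refused, taken.</p>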
<h1>Multiqueue block devices [3.16]</h1>
<ul class="incremental">
<li>
Each block device has a command queue (possibly shared with
</li>
<li>
Queue may be partly implemented by hardware (NCQ) or only
</li>
<li>
A single queue means initiation is serialised and completion
involves IPI - can be bottleneck for fast devices
</li>
<li>
High-end SSDs support multiple queues, but kernel needed changes
</li>
<li>
<tt>nvme</tt> and <tt>mtip32xx</tt> drivers now support
multiqueue, but SCSI drivers don't yet - may be backport-able?
</li>
</ul>
Linux 'Tux' logo © Larry Ewing, Simon Budig.
Redistribution is free but has to include this notice.
<ul>
<li>Modified by Ben to add Debian open-ND logo</li>
</ul>
Debian open-ND logo © Software in the Public Interest, Inc.

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.