Thursday, January 12, 2017

Fenix LD22 review

I get one Fenix LD22 from its website
It has pretty solid design and specifically optimized for AA size Alkaline or Ni-Mh battery and I think it is designed smartly.
Alkaline or Ni-Mh battery’s normal working voltage range is 1.0 to 1.5V. So with 2x AA size batteries, the voltage supply of the led is actually 2.0 to 3.0.
It uses a Cree XP-G2 LED bulb with near 3.0V nominal voltage. Cree XP-G2 outputs 150 lumens with 350mA current drain and 300 lumens with 750mA current drain.
The flashlight may use a booster converter to boost 2.0V~3.0V to 3V. The converter may have a decent efficiency near this range.
At 300 lumens mode, the converter would draw more than 750mA from battery. Alkaline battery works terribly in this mode due to its discharging curve. Quoted from
enter image description here
With 1A current drain, Alkaline works much worse than Ni-Mh.
At 100 lumens mode, the converter would draw more than 350mA. This is actually sweet region for Alkaline batt. Alkaline outperforms Ni-Mh for current drain below 300mA.
enter image description here
LD22 reaches a decent balance between Alkaline and Ni-Mh. AA size battery can be found everywhere and Alkaline delivers decent performances below 300 lumens mode. Ni-Mh battery covers pretty every mode and is favored in the manual. Eneloop is a perfect match for LD22. 2x count outputs good efficiency.

Thursday, April 28, 2016

History of Coroutine in Python

A coroutine is a kind of function that can suspend and resume its execution at various pre-defined locations in its code. Subroutines are a special case of coroutines that have just a single entry point and complete their execution by returning to their caller. Python’s coroutines (both the existing generator-based and the newly proposed variety) are not fully general, either, since they can only transfer control back to their caller when suspending their execution, as opposed to switching to some other coroutine as they can in the general case. When coupled with an event loop, coroutines can be used to do asynchronous processing, I/O in particular.

Python’s current coroutine support is based on the enhanced generators from PEP 342, which was adopted into Python 2.5. That PEP changed the yield statement to be an expression, added several new methods for generators (send(), throw(), and close()), and ensured that close() would be called when generators get garbage-collected. That functionality was further enhanced in Python 3.3 with PEP 380, which added the yield from expression to allow a generator to delegate some of its functionality to another generator (i.e. a sub-generator).

At first croutine is based on generator in Python. A generator is a function that produces a sequence of results instead of a single value. When a generator function is called, it returns a generator object without even beginning execution of the function. When next method is called for the first time, the function starts executing until it reaches yield statement. The yielded value is returned by the next call.

def yrange(n):
    i = 0
    x = 1
    while i < n:
        yield i+x
        #return i and pause here waiting for instruction from caller.
        i += 1

if __name__ = “__main__":
    g = yrange(6)

Generator will return a value and pause there, all related context is the same when it is resumed again by next() method. Though at first generator is used to create memory-friendly iterator. But its characteristic is very suitable to realize a croutine.

Caller function can use next() as communication pipe to notice generator to run and generator will pause at yield and return value to caller.

But what if caller want to pass an input as parameter for generator to use during its run. E.g. generator provides HTML parsing function, while caller loads the content of a website and pass the content to generator to parse.

By using global variable, it is possible to do this. A modified version of code above:

def yrange(n):
    i = 0
    global x
    while i < n:
        yield i+x
        #return i and pause here waiting for instruction from caller.
        i += 1

if __name__ = “__main__":
    x = 1
    g = yrange(6)
    x = 2

This solution is quite ugly because usually we want to avoid use of global variable. The safest way for programming is to pass value to function as parameter and return value to caller to avoid use state machine as a control mechanism.

Since Python 2.5, yield is changed as expression and behaves like a bidirectional communication tool. Generator has send(), next(), throw(), close() methods.

PEP342 gives a very detailed explanation about this:

Specification Summary
By adding a few simple methods to the generator-iterator type, and
with two minor syntax adjustments, Python developers will be able
to use generator functions to implement co-routines and other forms
of co-operative multitasking. These methods and adjustments are:

  1. Redefine yield to be an expression, rather than a statement.
    The current yield statement would become a yield expression
    whose value is thrown away. A yield expression’s value is
    None whenever the generator is resumed by a normal next() call.

  2. Add a new send() method for generator-iterators, which resumes
    the generator and “sends” a value that becomes the result of the
    current yield-expression. The send() method returns the next
    value yielded by the generator, or raises StopIteration if the
    generator exits without yielding another value.

  3. Add a new throw() method for generator-iterators, which raises
    an exception at the point where the generator was paused, and
    which returns the next value yielded by the generator, raising
    StopIteration if the generator exits without yielding another
    value. (If the generator does not catch the passed-in exception,
    or raises a different exception, then that exception propagates
    to the caller.)

  4. Add a close() method for generator-iterators, which raises
    GeneratorExit at the point where the generator was paused. If
    the generator then raises StopIteration (by exiting normally, or
    due to already being closed) or GeneratorExit (by not catching
    the exception), close() returns to its caller. If the generator
    yields a value, a RuntimeError is raised. If the generator
    raises any other exception, it is propagated to the caller.
    close() does nothing if the generator has already exited due to
    an exception or normal exit.

  5. Add support to ensure that close() is called when a generator
    iterator is garbage-collected.

  6. Allow yield to be used in try/finally blocks, since garbage
    collection or an explicit close() call would now allow the
    finally clause to execute.

yield can not only pop a value to caller but also accept value from caller by send() method.

yield expression is evaluated as None if next() is called. So basically next() equals send(None).

Another example is as below:

def corout1(n):
    for x in range(n):
        yield x

def jumping_range(up_to):
    """Generator for the sequence of integers from 0 to up_to, exclusive.
    Sending a value into the generator will shift the sequence by that amount.
    index = 0
    global jump
    while index < up_to:
        jump = yield index
        #print('step is {0}'.format(jump))
        if jump is None:
            jump = 1
        index += jump

if __name__ == '__main__':
    jump = None
    iterator = jumping_range(5)
    print(next(iterator))  # 0
    print('step is {0}'.format(jump))
    print(iterator.send(2))  # 2
    print('step is {0}'.format(jump))
    print(next(iterator))  # 3
    print('step is {0}'.format(jump))
    print(iterator.send(-1))  # 2
    print('step is {0}'.format(jump))
    for x in iterator:
        print(x)  # 3, 4
    gen1 = corout1(6)
    print(gen1.send('sldf')) #you can send any value

Here I use global variable jump as a probe to detect the inner status of the generator.

In [1]: run
step is None
step is 2
step is 1
step is -1

Here jump = yield index is assignment of yield expression value to jump. If next() is called, yield index will be evaluated as None and generator will continue to run until it hits yield expression again. When the generator is paused, assignment won’t be conducted, the assignment of jump is the first step to be run for next iteration.

Now since next() equals send(None), you can even avoid use of next() to keep consistence. Now yield expression behaviors like a transceiver, it first accept value from caller through send() method and when yield expression is hit again, it will throw the value in yield expression to its caller.

This should works fine as long as you fully understand the mechanism of yield in generator.

In PEP380, yield from is introduced to simplify pipeline of coroutines.

coroutine is not limited to communication only with its caller. It can also send data to another coroutine based input from its caller. This forms pipeline of coroutines.

def writer():
    """A coroutine that writes data *sent* to it to fd, socket, etc."""
    z = 'a'
    while True:
        w = (yield z)
        z = chr(97+w)
        print('>> ', w)
def filter(coro):
    coro.send(None)  # prime the coro
    i = 'hello'
    while True:
            x = (yield i)  # Capture the value that's sent
            i = coro.send(x)  # and pass it to the writer
        except StopIteration:

if __name__=="__main__":
    g = writer()
    f = filter(g)
    print(f.send(None))  #prime f


In [16]: run
>>  7

In order to capture exception, send data to coroutine and get result from coroutine, there are quite a bit of codes to handle.

yield from will take take of exception handling, communication between coroutines, which simplify the code a lot.

Just like yield, yield from is bidirectional operation. A revised version using yield from is like below:

def writer():
    """A coroutine that writes data *sent* to it to fd, socket, etc."""
    z = 'a'
    while True:
        w = (yield z)
        z = chr(97+w)
        print('>> ', w)
def filter(coro):
    #coro.send(None)  # prime the coro
    i = 'hello'
    i = yield from coro

if __name__=="__main__":
    g = writer()
    f = filter(g)
    print(f.send(None))  #prime f


In [18]: run
>>  7

As you can see, it is much easier to organize coroutine code with yield from keyword.

generator based coroutine together with event loop bring ability for async programming.

It is possible to use return in generator.

This is a new feature in Python 3.3 (as a comment notes, it doesn’t even work in 3.2). Much like return in a generator has long been equivalent to raise StopIteration(), return something in a generator is now equivalent to raise StopIteration(something). For that reason, the exception you’re seeing should be printed as StopException: 3, and the value is accessible through the attribute value on the exception object. If the generator is delegated to using the (also new) yield from syntax, it is the result. See PEP 380 for details.

def f():
    return 1
    yield 2

def g():
    x = yield from f()

g() # prints 1

Generator based coroutine is kind of confusing in terms of grammar. It is quite hard to understanding the bidirectional communication mechanism of yield/yield from without some deep introduction like this article. And generator is first introduced to bring one at a time iterator concept, then yield is revised to make generator suitable for coroutine concept.

async/await keywords are introduced since Python 3.5 to make grammar of coroutine programming more meaningful. await equals yield from. So you can either use generator based coroutine or async/await defined coroutine for async/concurrency environment.

Non-blocking program in Python

Async programming is often related with I/O bound or CPU bound tasks. I/O bound task means the code will wait for reading data from another process or thread for quite a time such as page content sent back from server or file content sent back from disk I/O program. CPU bound task often the code waits for results from another process that does heavy computing task.

So async programming naturally contains communications between different processes(thread is not efficient due to GIL). Current running coroutine will pause at where results are needed from another process. Then event loop will take control and send signal to another coroutine to run.

So the coroutine has to be non-blocking, which means it needs to check the status of outer process. If the results is ready, then it will run some code to process the results, otherwise it will yield again to give up the running privilege. The outer process has to be able to provide running status, so that the coroutine can be written in non-blocking style.

It is possible to mimic the async programming through subprocess module. The subprocess module can spawn a process from current process, which is executed independent from current process and can be used to mimic the process of loading content from a remote server or reading data from local disk. simply sleep n seconde and print ‘finished’ as result. When client get the response from server,

import sys
import time

if __name__ == '__main__':
    n = int(sys.argv[1])

import subprocess
import time

def nb_read(t):
    process = subprocess.Popen(['python3', '', t], shell=False, stdin=subprocess.PIPE, stdout=subprocess.PIPE, universal_newlines=True)

    while process.poll() is None:
        yield None
    yield process.communicate()

task1 = nb_read('5')
task2 = nb_read('3')
fin1 = False
fin2 = False

while 1:
    if not fin1:
        res1 = task1.send(None)
        if res1 != None:
            fin1 = True
            print('still working on task1')
    if not fin2:
        res2 = task2.send(None)
        if res2 != None:
            fin2 = True
            print('still working on task2')
    if fin1==True and fin2==True:

nb_read() is a non-blocking style coroutine to load data. It will spawn a process to read data. subprocess.poll() is the method to poll status from sub process. It the return value is None, then nb_read() will yield None to notify caller data loading is not finished yet and pauses there. Otherwise, nb_read() will use subprocss.communicate() to retrieve stdout content from sub process and terminate the process.

The while 1 loop is used to mimic event loop in asyncio library. It will create two generators of nb_read() and query if they have finished loading data. If so, it will close the generator. The loop will continue to run unless all reading tasks are finished. The loop use send(None) to inform the generator that you can run now.

That is the reason that why the generator based coroutine has to be decorated by @asyncio.coroutine.

Output of the

still working on task1
still working on task2
still working on task1
still working on task2
still working on task1
still working on task2
still working on task1
still working on task2
still working on task1

still working on task1

What is new in Python 3.3
In practice, what are the main uses for the new “yield from” syntax in Python 3.3?
How the heck does async/await work in Python 3.5?

Saturday, April 16, 2016

How to build a cross compliatio enviroment for arm based SBC

It always painful to compile programs for arm-based single board computer such as odroid. This tutorial is aimed to build up a cross compiling system for arm-based programs.

My computer has x64 OpenSuse Tumbleweed installed and the target is to install an armhf(armv7l) ubuntu 14.04 chroot system in it. There will be two sections:

I. use docker to install a ubuntu x64 14.04 container.
II. build up the chroot system inside the ubuntu docker container.

Section I: install ubuntu 14.04 x64 in a docker container.

OpenSuse's document about chroot is quite limited. HDL:chroot describes how to chroot into an armv7 opensuse build from x64 opensuse host. You barely can find any useful documents about how to chroot into an armv7 ubuntu build from x64 opensuse host. I have tried a lot and finally given up the idea to chroot armv7 ubuntu from x64 opensuse host. In terms of documentation and stability, ubuntu is the best linux distro for non advanced users.

if you have a x64 ubuntu host, can you skip this section.

Instead I installed a x64 ubuntu container through docker in the x64 opensuse host and use the x64 ubuntu as host to chroot into an armv7 ubuntu.

The container concept in docker is a bit like the virtual machine though some evaluations from IBM said it has almost the same performances as physical machine in most area. also I have no intent to install a ubuntu os just for compiling purpose. So docker is a good choice compared with virtual machine.

the first step is to install docker:
sudo zypper in docker

there are all kinds of system containers hosted on docker-hub. It is very simple to create a container.
You can "docker search keyword" to search the images whose names contain keyword.

In my case, it is as below:
docker search ubuntu
ubuntu Ubuntu is a Debian-based Linux operating s... 3672 [OK] ubuntu-upstart Upstart is an event-based replacement for ... 61 [OK] torusware/speedus-ubuntu Always updated official Ubuntu docker imag... 25 [OK] ubuntu-debootstrap debootstrap --variant=minbase --components... 24 [OK] rastasheep/ubuntu-sshd Dockerized SSH service, built on top of of... 23 [OK] nickistre/ubuntu-lamp LAMP server on Ubuntu 6 [OK] nickistre/ubuntu-lamp-wordpress LAMP on Ubuntu with wp-cli installed 5 [OK] nimmis/ubuntu This is a docker images different LTS vers... 4 [OK] nuagebec/ubuntu Simple always updated Ubuntu docker images... 4 [OK] maxexcloo/ubuntu Docker base image built on Ubuntu with Sup... 2 [OK] sylvainlasnier/ubuntu Ubuntu 15.10 root docker images with commo... 2 [OK] darksheer/ubuntu Base Ubuntu Image -- Updated hourly 1 [OK] admiringworm/ubuntu Base ubuntu images based on the official u... 1 [OK] jordi/ubuntu Ubuntu Base Image 1 [OK] rallias/ubuntu Ubuntu with the needful 0 [OK] lynxtp/ubuntu 0 [OK] life360/ubuntu Ubuntu is a Debian-based Linux operating s... 0 [OK] esycat/ubuntu Ubuntu LTS 0 [OK] widerplan/ubuntu Our basic Ubuntu images. 0 [OK] teamrock/ubuntu TeamRock's Ubuntu image configured with AW... 0 [OK] webhippie/ubuntu Docker images for ubuntu 0 [OK] konstruktoid/ubuntu Ubuntu base image 0 [OK] ustclug/ubuntu ubuntu image for docker with USTC mirror 0 [OK] suzlab/ubuntu ubuntu 0 [OK] uvatbc/ubuntu Ubuntu images with unprivileged user 0 [OK]
By using "docker pull ubuntu:version", you can download the corresponding image to you local drive.
docker pull ubuntu:latest

#it displays all images pulled from docker-hub.
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE ubuntu latest b72889fa879c 3 days ago 187.9 MB armv7/armhf-ubuntu 14.04.3 73915db97566 8 months ago 184.9 MB
#delete the cached image by specifying IMAGE_ID.
docker rmi IMAGE_ID

Now since you have the image, it is possible to create a container though "docker run"
docker run -it --privileged --name ubuntu ubuntu /bin/bash
#-it means tty interactive
#docker disables many system level operations by default. In my case, I need "--privileged" option to bring full control to the container in order to mount and install qemu-user-static.
#--name option is used to specify a name to the container for easy reference.
#/bin/bash tells docker to use /bin/bash as interpreter.

after running that command, you should see the container is created and it enters the container like ssh.
You can use "uname -a" to confirm it.

type "exit" to exit from the container.

docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 748f592e387b ubuntu "/bin/bash" 2 hours ago Up 2 hours ubuntu
You can see the container is still on

#stop the container from running
docker stop ubuntu

Now you need to re-enter into the container.
docker start ubuntu
docker exec -it ubuntu /bin/bash

It is easy to share files between host and container.
docker cp host_file ubuntu:path
docker cp ubuntu:file host path
#here ubuntu is the container's name, ":path" is the target path inside the container.

Up to now, the ubuntu container is set. The command grammars of docker is also very elegant and easy to use. We will move to section 2. You can also use the container as the test environment of x64 ubuntu.

Section II set up a chroot env for armv7 ubuntu

make sure you enters into ubunut container. commands below are executed in the container.
cd /opt
mkdir rootfs
sudo apt-get install debootstrap schroot qemu qemu-user-static

If there is warning message says:
update-binfmts: warning: Couldn't load the binfmt_misc module.
That is because in latest ubuntu, binfmt_misc is not mounted properly. Using following command to mount binfmt_misc first:
mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc/

#install the necessary packages to build a chroot env.
debootstrap --verbose --variant=buildd --foreign --include=iproute,iputils-ping --arch armhf trusty ./rootfs hold all arm based dists.
#--foreigh options is needed for different arch chroot
#--arch armhf armv7 arch, choose different arch if it is arm64.
#trusty means ubuntu 14.04 release.
#--include specify addtional packages out of standard release.

Cited from
QEMU sports two types of emulation:

syscall emulation: this translates code from one architecture to another, remapping system calls (calls to the kernel) between the two architectures

machine emulation: this emulates a complete computer, including a virtual CPU, a virtual video card, a virtual clock, a virtual SCSI controller etc.

QEMU's syscall emulation is much faster than machine emulation, but both are relatively slow when compared to native programs for your computer. One major drawback of syscall emulation is that it that some syscalls are not emulated, so some programs might now work, also when building programs, these might configure themselve for the features of the kernel they run on instead of what the target architecture actually supports.

For rootfs creation, using syscall emulation is probably fine for core packages, but you might run into issues with higher level ones.

Using syscall emulation

While qemu-arm can be used to run a single armel binary from the command-line, it's impractical to use it to run a program which will fork and execute other programs. Also, because qemu-arm itself uses shared libraries and the dynamic linker/loader from the build environment, it's impractical to copy it in the rootfs to use it with chroot-ed programs. So the following instructions expect that you're using a static version of qemu-arm ("qemu-arm-static"), and that your build environment supports the binfmt-misc module, this modules allows running any executable matching some configured pattern with an interpreter of your choice.

cat /proc/sys/fs/binfmt_misc/qemu-arm
#verify qemu-arm interpreter is register in the container system.

Because we're going to chroot into the rootfs directory, the kernel will look for the interepreter in the chroot; copy the interpreter in the chroot:
cp /usr/bin/qemu-arm-static rootfs/usr/bin

before the chroot, several commands needs to be run to bring up network to chroot env.
mount -o bind /dev     rootfs/dev
mount -o bind /dev/pts rootfs/dev/pts
mount -o bind /proc    rootfs/proc
mount -o bind /sys     rootfs/sys
cp /etc/resolv.conf    rootfs/etc/resolv.conf

now you can chroot into the armv7 ubuntu 14.04
chroot rootfs /bin/bash
/debootstrap/debootstrap --second-stage

uname -a
#check if the system is armv7 arch

#configure system(optional)
locale-gen en_US.UTF-8
locale-gen zh_CN.UTF-8
dpkg-reconfigure locales
export LC_ALL="en_US.UTF-8"
It looks any modification to change the locale in the chroot env won't take effect. Since I only want to use this env to compile programs. So my lazy solution is just to manually export LC_ALL each time.

#add ubuntu repo
echo "deb trusty main restricted universe multiverse" > /etc/apt/sources.list
echo "deb trusty-security main restricted universe multiverse" >> /etc/apt/sources.list
echo "deb trusty-updates main restricted universe multiverse" >> /etc/apt/sources.list
echo "deb trusty-backports main restricted universe multiverse" >> /etc/apt/sources.list

#update system
apt-get update
apt-get upgrade

every time when the docker container is stopped, it looks like docker will clean up the content inside /proc/sys/fs/binfmt_misc. You need to run
update-binfmts --import 
to bring the qemu-arm registration back.

In the end, you can compile a program to make sure the cross compiling system is OK.

phantomjs might be a good example because its dependencies are quite complex though the compiling time is very long.

following the instructions in

git crashes in the chroot env. So my solution is to git the src in the container and move that into rootfs.

it takes around 7-8 hours on my 16GB RAM and quad core HT CPU computer, which i think is pretty good considering three layer stacking structure.


If you want to exit the chroot env. You can type "exit" twice to come back all the way to x64 opensuse host.

Sunday, March 27, 2016

Bottle UWSGI Nginx Configuration

Before starting to read this article, you should know basically this article is a "fork" of Michael Lustfield's article Bottle + UWSGI + Nginx Quickstart.

The reason I want to modify this article is because I think it will be more beneficial if I add more notes I found during building up the whole server. Also I prefer another way to configure uwsgi from the scratch.

Bottle is a Python micro framework easy for deployment though its built-in http server has poor performance and support for https is not easy to implement. In the other side, Nginx is a high performance server software and can be easily configured to support https. But Nginx cannot speak Python. In order to support Python based micro framework, uwsgi is needed as midware to translate Python script to Nginx.

Installing Stuff
I'm going to be assuming the use of Ubuntu 14.04 and python3 and also you have good knowledge about Linux bash and Python. It's easy enough to adjust.
apt-get install nginx  
pip3 install uwsgi  
pip3 install bottle  
pip3 command ensures you getting the latest uwsgi and bottle releases.

Before continue to detailed setup.You should know privileges regarding nginx, uwsgi, and bottle.

If you want to use unix socket file as a tunnel for communication between nginx and uwsgi, you have to make sure both nginx and uwsgi have write/read privileges to the shared socket.

Usually nginx belongs to www-data group. So it is suggested to add current user to www-data group.
usermod -a -G www-data %username
%username is your linux user name.

If you get "502 bad gateway" errors while you think configurations are OK, try to use "ls -l" to verify if the socket is writable to www-data group.

Your First Bottle Application
I tend to start with a basic structure:
Whether I have anything in the directories or not, they exist. It's just how I make sure I keep things consistent across applications.

Don't forgtet to run:
chown -R www-data:www-data /var/www  
#change user and group of folder which contains bottle micor-framework to www-data.  
mkdir /run/uwsgi  
chown -R www-data:www-data /run/uwsgi  
#/run/uwsgi/sock is unix socket file that is going to be used by uwsgi loader in order to communicate with nginx.  
A basic skeleton of will look something like this:
A basic bottle app skeleton  
from bottle import route, template  
def static(filename):  
  Serve static files  
  return bottle.static_file(filename, root='./static')  
def show_index():  
  The front "index" page  
  return 'Hello'  
def show_page(page_name):  
  Return a page that has been rendered using a template  
  return template('page', page_name=page_name)  
if __name__ == '__main__':"", port=8080)  
  app = application = bottle.default_app()  
"""why you needs app = application = ... in uwsgi mode?  
because by default, uwsgi loader search for app or application when it looks into a python script.  
if it is a Flask deployment, use app = Flask(__name__) instead."""
Try it out!:
You'll see the application start running. Go to Neat, huh?

The Templating System
Bottle has a bunch of templating options. For now, we're only going to touch the most basic option.
You are visiting {{page_name}}!
%rebase base
<html dir="ltr" lang="en"> <head> <title>My Site!</title> </head> <body> <div id="pagebody"> %include </div> </body> </html> This is obviously very basic, but it will get you started. Check out the Bottle Docs for more information. The templating options are endless!
Now that you have this done, restart and visit You should be seeing a rather blank looking page that says "You are visiting foo" with the title "My Site!"

Adding UWSGI
The UWSGI configuration is pretty simple. See the UWSGI Docs for more details information.
Edit /var/www/bottle/uwsgi.ini:
socket = /run/uwsgi/sock
;;for bottle, it is safe to use socket protocol.
chdir = /var/www/bottle
master = true
plugins = python
;;plugins are built python*.so libary for uwsgi.
;plugins-dir = path
;;specify python plugins by plugins-dir if you want it.
file =
;;specify the python file to be used by uwsgi.
;;suppose there are all contains app routine.
;;it is possible to use "mount" to define different routines to handle different page requests.
;mount = /
;mount = /
;mount = /
;;generally flask/bottle apps expose the 'app' callable instead of 'application'
;callable = app
;;tell uWSGI to rewrite PATH_INFO and SCRIPT_NAME according to mount-points
manage-script-name = true

uid = www-data
gid = www-data
;;specify uid and gid, assuming /www/var belongs to www-data:www-data. it is better to run uwsgi in non-root mode.
;pythonpath = ..
;;specify python path.
;processes = 4
;thread = 2
;stats =
uwsgi --ini uwsgi.ini
You should see some logging information.

Adding Nginx
I prefer using the /etc/nginx/sites-enable directory for my configurations. You can do as you wish on your server.
Edit /etc/nginx/sites-enable/default:
upstream _bottle {
    server unix:/run/uwsgi/sock;
#upstream defines common block that can be shared by different derivatives. 

server {
    listen [::]:80;
    listen 80;
    server_name localhost;
    #make sure it is accessible through localhost
    root /var/www/bottle;

    location / {
        try_files $uri @uwsgi;
        #try files will try to load $uri, if it does not exist, then @uwsgi will be called instead.

    location @uwsgi {
        include uwsgi_params;
        #include uwsgi parameters.
        uwsgi_pass _bottle;
        #tell nginx to communicate with uwsgi though unix socket /run/uwsgi/sock.
#it is possible to configure this server to https as long as you get certification from letsencrypt. Search nginx + letsencrypt for necessary info if you are interested.

In our bottle application, we defined a route for static content. However, it's better to have nignx serve this data so that we can avoid making python do any work. That's why we use try_files in the location block. You want that in your bottle application for development, but when we deploy, it won't actually get used.
Then restart the service:
nginx -s reload
You'll now be able to access your bottle application from the internet through nginx.
In fact it is possible to verify it though command:
curl localhost:80
w3m localhost:80

start uwsgi as service
It is possible to run uwsgi on boot with upstart.
mkdir /etc/uwsgi
ln -s /path/to/uwsgi.ini uwsgi.ini
create /etc/init/uwsgi.conf,
# Emperor uWSGI script

description "uWSGI Emperor"
start on runlevel [2345]
stop on runlevel [06]


exec uwsgi --master --die-on-term --emperor /etc/uwsgi

upstart will call uwsgi in emperor mode, then uwsgi will create its service based on ini file.

When you want to restart uwsgi process after modifying the python file, never use the "touch-reload" like option from uwsgi. This process could be bugy, you may find it won't work at all and create all kinds of problems. Don't waste your time. Use following commands instead:
initctl stop uwsgi
initctl start uwsgi
the first command ensures uwsgi service is fully stopped.
and the second command will start uwsgi service again.

Saturday, June 27, 2015


安装odroid系统ubuntu 14.04在sd卡上,首先还是要确定sd卡的erase block size等参数来确定分区参数。这里使用的是sony的sr32uya/tqmn。
weiyuan@linux-lnhx:~/Downloads/flashbench-linaro> sudo ./flashbench -a /dev/sdb  --blocksize=1024
align 8589934592    pre 613µs    on 795µs    post 636µs    diff 170µs
align 4294967296    pre 599µs    on 797µs    post 627µs    diff 184µs
align 2147483648    pre 603µs    on 754µs    post 628µs    diff 138µs
align 1073741824    pre 602µs    on 762µs    post 635µs    diff 143µs
align 536870912    pre 604µs    on 760µs    post 630µs    diff 143µs
align 268435456    pre 606µs    on 756µs    post 639µs    diff 134µs
align 134217728    pre 595µs    on 766µs    post 642µs    diff 147µs
align 67108864    pre 598µs    on 761µs    post 636µs    diff 144µs
align 33554432    pre 609µs    on 768µs    post 642µs    diff 142µs
align 16777216    pre 597µs    on 751µs    post 628µs    diff 138µs
align 8388608    pre 596µs    on 769µs    post 653µs    diff 145µs
align 4194304    pre 593µs    on 762µs    post 638µs    diff 147µs
align 2097152    pre 588µs    on 755µs    post 650µs    diff 135µs
align 1048576    pre 599µs    on 765µs    post 645µs    diff 143µs
align 524288    pre 597µs    on 768µs    post 648µs    diff 145µs
align 262144    pre 591µs    on 748µs    post 637µs    diff 134µs
align 131072    pre 597µs    on 758µs    post 649µs    diff 135µs
align 65536    pre 575µs    on 732µs    post 639µs    diff 125µs
align 32768    pre 580µs    on 734µs    post 628µs    diff 130µs
align 16384    pre 585µs    on 732µs    post 623µs    diff 128µs
align 8192    pre 655µs    on 663µs    post 649µs    diff 11µs
align 4096    pre 693µs    on 700µs    post 687µs    diff 9.71µs
align 2048    pre 721µs    on 724µs    post 711µs    diff 8.34µs

Erase block size is 4M from the results and sony sd card does show quite goood speed compared with previous card.

we cannot simply write the img file of ubuntu simply to the sd card.

Thursday, February 26, 2015

beaglebone black简单入门(三)

在这一章中将详细介绍sd卡的分区,格式化和挂载。为什么要这么深入的研究,因为sd卡和emmc所用的flash颗粒是有wear leveling的,现有的操作系统对于如何格式化和访问flash并无很成熟的方案,所以默认的格式化方案对于flash的访问速度和lift time来说并不是最优方案。好的配置可以提高sd卡的读写性能及延长使用寿命。因此仔细的规划sd的格式是非常值得研究的。


一个sd的封装可能包含N个chip,每个chip中还有若干个plane,其中每个plane中又包含若干erase block, 每个erase block还包含若干个page。

其中page是flash的最小读写单位,而flash的特殊之处在于其写入前需要erase整个erase block,将其中的bit全重置为1。也就是说读操作是可以直接读一个page的信息,但是写入时,哪怕只有一个page的信息,也需要erase掉整个block。

从图中也可以看出,并行的chip,同一chip上并行的plan其实是可以同时读写的,这个读写单位在图中叫做clustered block。这和磁盘RAID阵列其实概念非常相近。后面再解释ext4参数时会用到这个概念。

从上面的flash的基本定义,可以看出来,在format的过程中,需要知道sd卡的page,erase block size,和plane等信息才能给出正确的format参数,因为fdisk等工具默认的最小读写unit是512字节,这是旧时代老硬盘的标准,现代硬盘的标准簇实际上已经是4K了,而SD卡等flash介质的的page,erase block size都是不确定的,很难找到官方的标准。

用git clone下载后,进入文件夹sudo make来编译

以我买的scandisk 32G ultra microSDHC(SDSDQU-032G-AFFP-A)为例,这是一款32G的sd卡。

sudo udevadm info -a -n /dev/mmcblk1可以看到emmc的信息

sudo udevadm info -a -n /dev/mmcblk0查看

可以看到emmc的erase size是2M,其大小是7667712*512byte,折合大小约3.65625GB。
beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk1 --count=20
[sudo] password for beaglebone:
align 1073741824        pre 1.97ms      on 2.36ms       post 1.75ms     diff 502µs
align 536870912 pre 2.32ms      on 2.8ms        post 2.13ms     diff 570µs
align 268435456 pre 2.58ms      on 2.85ms       post 2.2ms      diff 464µs
align 134217728 pre 1.95ms      on 2.5ms        post 2.31ms     diff 368µs
align 67108864  pre 2.04ms      on 2.54ms       post 2.15ms     diff 441µs
align 33554432  pre 2.13ms      on 2.62ms       post 2.15ms     diff 480µs
align 16777216  pre 2.1ms       on 2.63ms       post 2.24ms     diff 462µs
align 8388608   pre 2.14ms      on 2.64ms       post 2.17ms     diff 481µs
align 4194304   pre 2.1ms       on 2.63ms       post 2.2ms      diff 474µs
align 2097152   pre 2.07ms      on 2.56ms       post 2.12ms     diff 464µs
align 1048576   pre 2.11ms      on 2.23ms       post 2.25ms     diff 52.5µs
align 524288    pre 2.15ms      on 2.18ms       post 2.21ms     diff -6226ns
align 262144    pre 2.16ms      on 2.18ms       post 2.2ms      diff -1092ns
align 131072    pre 2.17ms      on 2.17ms       post 2.2ms      diff -9687ns
align 65536     pre 2.15ms      on 2.18ms       post 2.21ms     diff 611ns
align 32768     pre 2.16ms      on 2.18ms       post 2.22ms     diff -5497ns

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk1 --blocksize=1024 --count=20
align 1073741824        pre 1.22ms      on 1.53ms       post 875µs      diff 485µs
align 536870912 pre 1.21ms      on 1.57ms       post 947µs      diff 489µs
align 268435456 pre 1.25ms      on 1.72ms       post 1.13ms     diff 532µs
align 134217728 pre 1.18ms      on 1.64ms       post 1.1ms      diff 495µs
align 67108864  pre 1.19ms      on 1.65ms       post 1.09ms     diff 514µs
align 33554432  pre 1.25ms      on 1.74ms       post 1.14ms     diff 545µs
align 16777216  pre 1.24ms      on 1.7ms        post 1.15ms     diff 504µs
align 8388608   pre 1.28ms      on 1.77ms       post 1.16ms     diff 546µs
align 4194304   pre 1.24ms      on 1.72ms       post 1.16ms     diff 520µs
align 2097152   pre 1.23ms      on 1.7ms        post 1.16ms     diff 506µs
align 1048576   pre 1.14ms      on 1.28ms       post 1.16ms     diff 127µs
align 524288    pre 1.13ms      on 1.26ms       post 1.15ms     diff 122µs
align 262144    pre 1.14ms      on 1.26ms       post 1.15ms     diff 117µs
align 131072    pre 1.14ms      on 1.26ms       post 1.15ms     diff 116µs
align 65536     pre 1.13ms      on 1.26ms       post 1.15ms     diff 117µs
align 32768     pre 1.14ms      on 1.26ms       post 1.15ms     diff 116µs
align 16384     pre 1.14ms      on 1.26ms       post 1.14ms     diff 116µs
align 8192      pre 1.14ms      on 1.26ms       post 1.15ms     diff 117µs
align 4096      pre 1.14ms      on 1.26ms       post 1.14ms     diff 120µs
align 2048      pre 1.14ms      on 1.2ms        post 1.14ms     diff 53.7µs
可以看出,从2024到4096速度差有突变,这是由于读取单位最小是4K引起的,所以该emmc的page size是4K。

这里测试的blocksize参数是猜测的page大小,如果和实际的page大小是吻合的话,那么在小于erase block size的范围内读取时间差应该是比较小且一致性好,因为都是在一个erase block size内读两个N*page的数据。

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk0 --count=20
align 8589934592        pre 1.6ms       on 1.66ms       post 1.6ms      diff 55.4µs
align 4294967296        pre 1.6ms       on 1.67ms       post 1.6ms      diff 72.5µs
align 2147483648        pre 1.61ms      on 1.67ms       post 1.6ms      diff 68.7µs
align 1073741824        pre 1.6ms       on 1.67ms       post 1.6ms      diff 73.3µs
align 536870912 pre 1.6ms       on 1.67ms       post 1.6ms      diff 76.3µs
align 268435456 pre 1.6ms       on 1.67ms       post 1.6ms      diff 73.5µs
align 134217728 pre 1.6ms       on 1.67ms       post 1.6ms      diff 75.5µs
align 67108864  pre 1.6ms       on 1.67ms       post 1.6ms      diff 69.1µs
align 33554432  pre 1.6ms       on 1.67ms       post 1.6ms      diff 67.9µs
align 16777216  pre 1.58ms      on 1.67ms       post 1.57ms     diff 93.2µs
align 8388608   pre 1.6ms       on 1.65ms       post 1.6ms      diff 50µs
align 4194304   pre 1.62ms      on 1.75ms       post 1.67ms     diff 103µs
align 2097152   pre 1.63ms      on 1.63ms       post 1.62ms     diff 5.51µs
align 1048576   pre 1.63ms      on 1.63ms       post 1.63ms     diff -2488ns
align 524288    pre 1.63ms      on 1.69ms       post 1.62ms     diff 58.4µs
align 262144    pre 1.63ms      on 1.63ms       post 1.63ms     diff -151ns
align 131072    pre 1.63ms      on 1.63ms       post 1.63ms     diff -1616ns
align 65536     pre 1.62ms      on 1.63ms       post 1.63ms     diff -1548ns
align 32768     pre 1.62ms      on 1.63ms       post 1.63ms     diff 6.12µs

count=20是指读20次以减少误差。这里没有指定读的blocksize,程序自动选择了16384字节(16K)为读取单位。看起来好像是4194304(4MB)是erase block size,但是注意262144到524288同样也出现了突变,这个无法解释。

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk0 --blocksize=1024 --count=20
align 8589934592        pre 602µs       on 649µs        post 593µs      diff 50.8µs
align 4294967296        pre 613µs       on 670µs        post 596µs      diff 66.1µs
align 2147483648        pre 843µs       on 917µs        post 845µs      diff 73.2µs
align 1073741824        pre 848µs       on 921µs        post 845µs      diff 74.5µs
align 536870912 pre 844µs       on 917µs        post 846µs      diff 71.9µs
align 268435456 pre 842µs       on 916µs        post 845µs      diff 72.3µs
align 134217728 pre 846µs       on 918µs        post 847µs      diff 71.6µs
align 67108864  pre 847µs       on 915µs        post 843µs      diff 69.8µs
align 33554432  pre 842µs       on 921µs        post 847µs      diff 76.2µs
align 16777216  pre 830µs       on 924µs        post 821µs      diff 98.5µs
align 8388608   pre 838µs       on 896µs        post 843µs      diff 55.3µs
align 4194304   pre 878µs       on 961µs        post 823µs      diff 110µs
align 2097152   pre 880µs       on 921µs        post 871µs      diff 45.4µs
align 1048576   pre 878µs       on 915µs        post 877µs      diff 37.9µs
align 524288    pre 883µs       on 972µs        post 867µs      diff 97.5µs
align 262144    pre 874µs       on 915µs        post 878µs      diff 38.8µs
align 131072    pre 876µs       on 921µs        post 871µs      diff 46.9µs
align 65536     pre 873µs       on 909µs        post 874µs      diff 35.5µs
align 32768     pre 880µs       on 914µs        post 870µs      diff 38.6µs
align 16384     pre 874µs       on 910µs        post 871µs      diff 37.2µs
align 8192      pre 870µs       on 906µs        post 870µs      diff 35.9µs
align 4096      pre 827µs       on 905µs        post 869µs      diff 56.7µs
align 2048      pre 828µs       on 830µs        post 827µs      diff 2.05µs
看起来似乎4K是page的大小,而erase block size是4MB,但是524288处的时间差偏大,如果假设erase block size的大小是512K的话,很难解释为什么1M的读数差又变小,看起来更像是猜测的blocksize和实际page不是align引起的。

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk0 --blocksize=4096 --count=20
align 8589934592        pre 734µs       on 782µs        post 727µs      diff 51.6µs
align 4294967296        pre 803µs       on 874µs        post 799µs      diff 73.1µs
align 2147483648        pre 986µs       on 1.05ms       post 988µs      diff 67µs
align 1073741824        pre 985µs       on 1.06ms       post 984µs      diff 75.8µs
align 536870912 pre 989µs       on 1.06ms       post 987µs      diff 66.9µs
align 268435456 pre 989µs       on 1.06ms       post 983µs      diff 70µs
align 134217728 pre 987µs       on 1.06ms       post 986µs      diff 73.6µs
align 67108864  pre 990µs       on 1.06ms       post 987µs      diff 71.7µs
align 33554432  pre 985µs       on 1.06ms       post 983µs      diff 73.8µs
align 16777216  pre 967µs       on 1.06ms       post 963µs      diff 95.2µs
align 8388608   pre 987µs       on 1.04ms       post 979µs      diff 56.5µs
align 4194304   pre 1.01ms      on 1.11ms       post 963µs      diff 119µs
align 2097152   pre 1.02ms      on 1.06ms       post 1.01ms     diff 43.9µs
align 1048576   pre 1.01ms      on 1.06ms       post 1.02ms     diff 42.7µs
align 524288    pre 1.02ms      on 1.11ms       post 1.01ms     diff 95.5µs
align 262144    pre 1.01ms      on 1.05ms       post 1.01ms     diff 36.6µs
align 131072    pre 1.02ms      on 1.05ms       post 1.01ms     diff 37.9µs
align 65536     pre 1.01ms      on 1.05ms       post 1.02ms     diff 35.8µs
align 32768     pre 1.02ms      on 1.05ms       post 1.01ms     diff 39.4µs
align 16384     pre 1.01ms      on 1.05ms       post 1.02ms     diff 37.4µs
align 8192      pre 1.01ms      on 1.05ms       post 1.01ms     diff 33.1µs

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk0 --blocksize=8192 --count=20
align 8589934592        pre 921µs       on 967µs        post 913µs      diff 49.5µs
align 4294967296        pre 1.2ms       on 1.27ms       post 1.19ms     diff 69.8µs
align 2147483648        pre 1.2ms       on 1.27ms       post 1.19ms     diff 72.1µs
align 1073741824        pre 1.2ms       on 1.27ms       post 1.2ms      diff 70µs
align 536870912 pre 1.19ms      on 1.27ms       post 1.19ms     diff 76.1µs
align 268435456 pre 1.2ms       on 1.26ms       post 1.2ms      diff 67.7µs
align 134217728 pre 1.2ms       on 1.27ms       post 1.19ms     diff 72.9µs
align 67108864  pre 1.19ms      on 1.26ms       post 1.19ms     diff 69.4µs
align 33554432  pre 1.19ms      on 1.26ms       post 1.19ms     diff 71.1µs
align 16777216  pre 1.18ms      on 1.26ms       post 1.17ms     diff 88.9µs
align 8388608   pre 1.19ms      on 1.25ms       post 1.19ms     diff 58.4µs
align 4194304   pre 1.22ms      on 1.3ms        post 1.2ms      diff 93.3µs
align 2097152   pre 1.23ms      on 1.23ms       post 1.22ms     diff 1.74µs
align 1048576   pre 1.22ms      on 1.22ms       post 1.22ms     diff -4151ns
align 524288    pre 1.23ms      on 1.28ms       post 1.22ms     diff 57.3µs
align 262144    pre 1.22ms      on 1.22ms       post 1.22ms     diff -1387ns
align 131072    pre 1.22ms      on 1.22ms       post 1.22ms     diff 1.01µs
align 65536     pre 1.22ms      on 1.21ms       post 1.22ms     diff -7133ns
align 32768     pre 1.22ms      on 1.23ms       post 1.22ms     diff 5.34µs
align 16384     pre 1.22ms      on 1.22ms       post 1.22ms     diff 211ns

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk0 --blocksize=32768
[sudo] password for beaglebone:
align 8589934592        pre 2.03ms      on 2.08ms       post 2.02ms     diff 56µs
align 4294967296        pre 2.42ms      on 2.49ms       post 2.42ms     diff 73.5µs
align 2147483648        pre 2.4ms       on 2.49ms       post 2.43ms     diff 76.2µs
align 1073741824        pre 2.42ms      on 2.49ms       post 2.42ms     diff 68.7µs
align 536870912 pre 2.42ms      on 2.49ms       post 2.42ms     diff 72.8µs
align 268435456 pre 2.42ms      on 2.49ms       post 2.42ms     diff 72.7µs
align 134217728 pre 2.41ms      on 2.49ms       post 2.42ms     diff 74.6µs
align 67108864  pre 2.45ms      on 2.52ms       post 2.42ms     diff 89.3µs
align 33554432  pre 2.4ms       on 2.49ms       post 2.43ms     diff 75.4µs
align 16777216  pre 2.44ms      on 2.57ms       post 2.44ms     diff 128µs
align 8388608   pre 2.42ms      on 2.47ms       post 2.42ms     diff 56.9µs
align 4194304   pre 2.45ms      on 2.54ms       post 2.43ms     diff 94.3µs
align 2097152   pre 2.45ms      on 2.45ms       post 2.45ms     diff 1.25µs
align 1048576   pre 2.44ms      on 2.44ms       post 2.45ms     diff -3208ns
align 524288    pre 2.45ms      on 2.5ms        post 2.44ms     diff 58.9µs
align 262144    pre 2.45ms      on 2.44ms       post 2.45ms     diff -6505ns
align 131072    pre 2.44ms      on 2.45ms       post 2.44ms     diff 3.37µs
align 65536     pre 2.44ms      on 2.44ms       post 2.45ms     diff -3327ns


beaglebone@beaglebone:~/flashbench$ factor 62333952
62333952: 2 2 2 2 2 2 2 2 2 2 3 103 197

公因数中有3,考虑到该sd卡有可能是TLC闪存,一位可以存储3bit,所以有可能page size是4K为基础的3的倍数。

103*197并不是2的指数,这不是问题,因为over provisioning已经被很多厂商采用,controller会保留一部分的flash单元专门用于gc等操作,这些单元对用户是不可见的。

对于flash来说,存储单元page仍是以4K*N(N是2的指数)为单位的,但是一个TLC存储单元表示3bit,所以换算成logic page size就是12K*N了。

try 12K
beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk0 --blocksize=12288 --count=20
align 6442450944        pre 1.4ms       on 1.48ms       post 1.4ms      diff 75.7µs
align 3221225472        pre 1.4ms       on 1.49ms       post 1.4ms      diff 85.1µs
align 1610612736        pre 1.4ms       on 1.49ms       post 1.4ms      diff 92.8µs
align 805306368 pre 1.4ms       on 1.49ms       post 1.4ms      diff 93.7µs
align 402653184 pre 1.4ms       on 1.49ms       post 1.4ms      diff 93.8µs
align 201326592 pre 1.39ms      on 1.49ms       post 1.4ms      diff 94.4µs
align 100663296 pre 1.4ms       on 1.49ms       post 1.4ms      diff 96.1µs
align 50331648  pre 1.4ms       on 1.49ms       post 1.4ms      diff 96.6µs
align 25165824  pre 1.4ms       on 1.49ms       post 1.4ms      diff 89.7µs
align 12582912  pre 1.4ms       on 1.66ms       post 1.55ms     diff 184µs
align 6291456   pre 1.4ms       on 1.4ms        post 1.4ms      diff 1.35µs
align 3145728   pre 1.43ms      on 1.47ms       post 1.43ms     diff 40.7µs
align 1572864   pre 1.43ms      on 1.47ms       post 1.42ms     diff 41.6µs
align 786432    pre 1.43ms      on 1.47ms       post 1.42ms     diff 46.5µs
align 393216    pre 1.43ms      on 1.47ms       post 1.43ms     diff 41.6µs
align 196608    pre 1.43ms      on 1.47ms       post 1.42ms     diff 43.3µs
align 98304     pre 1.43ms      on 1.47ms       post 1.42ms     diff 48µs
align 49152     pre 1.42ms      on 1.47ms       post 1.42ms     diff 42.8µs
align 24576     pre 1.42ms      on 1.47ms       post 1.43ms     diff 41.2µs

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk0 --blocksize=24576 --count=20
align 6442450944        pre 1.65ms      on 1.72ms       post 1.65ms     diff 69.9µs
align 3221225472        pre 2ms on 2.09ms       post 2ms        diff 88.3µs
align 1610612736        pre 2.01ms      on 2.09ms       post 2ms        diff 87.3µs
align 805306368 pre 2ms on 2.09ms       post 2ms        diff 88µs
align 402653184 pre 2ms on 2.09ms       post 2ms        diff 87.6µs
align 201326592 pre 2ms on 2.09ms       post 2ms        diff 87.4µs
align 100663296 pre 2.01ms      on 2.09ms       post 2ms        diff 88.1µs
align 50331648  pre 2.01ms      on 2.09ms       post 2ms        diff 86µs
align 25165824  pre 2ms on 2.09ms       post 2ms        diff 89.2µs
align 12582912  pre 2ms on 2.28ms       post 2.18ms     diff 187µs
align 6291456   pre 2ms on 2ms  post 2.01ms     diff -660ns
align 3145728   pre 2.03ms      on 2.03ms       post 2.02ms     diff -312ns
align 1572864   pre 2.03ms      on 2.03ms       post 2.02ms     diff 1.23µs
align 786432    pre 2.03ms      on 2.04ms       post 2.03ms     diff 3.9µs
align 393216    pre 2.03ms      on 2.03ms       post 2.03ms     diff 1.68µs
align 196608    pre 2.03ms      on 2.03ms       post 2.02ms     diff 825ns
align 98304     pre 2.03ms      on 2.03ms       post 2.03ms     diff -1041ns
align 49152     pre 2.03ms      on 2.03ms       post 2.03ms     diff -672ns
比较通顺了,看起来24K可以align,且erase block size的大小是12582912,也就是12M。如果24K可以和page align,那么24K*N应该都可以align。

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk0 --blocksize=49152 --count=20
align 6442450944        pre 2.75ms      on 2.83ms       post 2.75ms     diff 72.3µs
align 3221225472        pre 3.21ms      on 3.3ms        post 3.21ms     diff 89.1µs
align 1610612736        pre 3.21ms      on 3.3ms        post 3.21ms     diff 87µs
align 805306368 pre 3.22ms      on 3.3ms        post 3.21ms     diff 87µs
align 402653184 pre 3.22ms      on 3.3ms        post 3.21ms     diff 86.9µs
align 201326592 pre 3.22ms      on 3.3ms        post 3.22ms     diff 86.3µs
align 100663296 pre 3.21ms      on 3.3ms        post 3.21ms     diff 90.9µs
align 50331648  pre 3.21ms      on 3.3ms        post 3.21ms     diff 89.3µs
align 25165824  pre 3.21ms      on 3.3ms        post 3.21ms     diff 86.9µs
align 12582912  pre 3.21ms      on 3.52ms       post 3.39ms     diff 214µs
align 6291456   pre 3.21ms      on 3.21ms       post 3.21ms     diff 3.22µs
align 3145728   pre 3.24ms      on 3.24ms       post 3.24ms     diff 4.21µs
align 1572864   pre 3.24ms      on 3.24ms       post 3.24ms     diff 4.54µs
align 786432    pre 3.24ms      on 3.24ms       post 3.24ms     diff 2.77µs
align 393216    pre 3.24ms      on 3.24ms       post 3.24ms     diff 2.68µs
align 196608    pre 3.23ms      on 3.24ms       post 3.24ms     diff 5.64µs
align 98304     pre 3.24ms      on 3.24ms       post 3.24ms     diff 7.47µs

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk0 --blocksize=98304 --count=20
align 6442450944        pre 5.16ms      on 5.22ms       post 5.15ms     diff 67.2µs
align 3221225472        pre 5.8ms       on 5.89ms       post 5.8ms      diff 87.6µs
align 1610612736        pre 5.81ms      on 5.89ms       post 5.8ms      diff 87.9µs
align 805306368 pre 5.81ms      on 5.89ms       post 5.8ms      diff 82µs
align 402653184 pre 5.8ms       on 5.89ms       post 5.8ms      diff 91.2µs
align 201326592 pre 5.81ms      on 5.89ms       post 5.8ms      diff 85.9µs
align 100663296 pre 5.81ms      on 5.89ms       post 5.8ms      diff 88.1µs
align 50331648  pre 5.81ms      on 5.89ms       post 5.8ms      diff 82.6µs
align 25165824  pre 5.81ms      on 5.89ms       post 5.8ms      diff 82.2µs
align 12582912  pre 5.81ms      on 6.12ms       post 6.01ms     diff 213µs
align 6291456   pre 5.81ms      on 5.81ms       post 5.81ms     diff -3359ns
align 3145728   pre 5.86ms      on 5.86ms       post 5.85ms     diff 3.84µs
align 1572864   pre 5.86ms      on 5.86ms       post 5.85ms     diff 1.57µs
align 786432    pre 5.86ms      on 5.85ms       post 5.85ms     diff -5169ns
align 393216    pre 5.86ms      on 5.86ms       post 5.86ms     diff -2854ns
align 196608    pre 5.85ms      on 5.86ms       post 5.86ms     diff 661ns

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench -a /dev/mmcblk0 --blocksize=196608 --count=20
[sudo] password for beaglebone:
align 6442450944        pre 9.72ms      on 9.79ms       post 9.73ms     diff 62.9µs
align 3221225472        pre 10.7ms      on 10.8ms       post 10.7ms     diff 87.5µs
align 1610612736        pre 10.7ms      on 10.8ms       post 10.7ms     diff 77.6µs
align 805306368 pre 10.8ms      on 10.8ms       post 10.7ms     diff 76.1µs
align 402653184 pre 10.7ms      on 10.8ms       post 10.7ms     diff 116µs
align 201326592 pre 10.7ms      on 10.8ms       post 10.7ms     diff 87µs
align 100663296 pre 10.7ms      on 10.8ms       post 10.7ms     diff 89.9µs
align 50331648  pre 10.7ms      on 10.8ms       post 10.7ms     diff 93.7µs
align 25165824  pre 10.7ms      on 10.8ms       post 10.7ms     diff 94.8µs
align 12582912  pre 10.7ms      on 11.1ms       post 10.9ms     diff 247µs
align 6291456   pre 10.7ms      on 10.7ms       post 10.7ms     diff -17098n
align 3145728   pre 10.8ms      on 10.8ms       post 10.8ms     diff 19.5µs
align 1572864   pre 10.9ms      on 10.8ms       post 10.8ms     diff -57897n
align 786432    pre 10.8ms      on 10.8ms       post 10.8ms     diff -6543ns
align 393216    pre 10.8ms      on 10.8ms       post 10.8ms     diff -31925n


对于TLC颗粒来说,就是以8K为page,4M为一个erase block size。

在25165824处读取时间的降低也可以解释。当我们写入是使用的logic block address(LBA),但在sd卡内部,不是所有的颗粒都是线性地址,前面提到sd卡内部可能有多个plane,不同plane上的block其实是可以同时访问的,sd卡的controller通过FTL来将LBA翻译成内部实际的physic block address(PBA)。当我们读取两个不同block上的page时,FTL可以将这两个block映射成sd卡上不同plane的block,读取两个不同plane上的block的时间和读取一个plane上一个block的时间应该相近,所以可以看到25165824之后的读取速度反而提高了。

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench /dev/mmcblk0 --open-au --erasesize=$[12*1024*1024] --blocksize=$[12*1024] --open-au-nr=1
12MiB   13.2M/s
6MiB    13.2M/s
3MiB    12.3M/s
1.5MiB  13M/s
768KiB  12.9M/s
384KiB  13.1M/s
192KiB  13M/s
96KiB   11.6M/s
48KiB   11.3M/s
24KiB   6.53M/s
12KiB   3.47M/s

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench /dev/mmcblk0 --open-au --erasesize=$[12*1024*1024] --blocksize=$[24*1024] --open-au-nr=1
12MiB   12.2M/s
6MiB    13M/s
3MiB    12.7M/s
1.5MiB  12.9M/s
768KiB  13.2M/s
384KiB  13.6M/s
192KiB  13.4M/s
96KiB   12M/s
48KiB   11.7M/s
24KiB   6.23M/s

beaglebone@beaglebone:~/flashbench$ sudo ./flashbench /dev/mmcblk0 --open-au --erasesize=$[12*1024*1024] --blocksize=$[48*1024] --open-au-nr=1
12MiB   12.8M/s
6MiB    12.7M/s
3MiB    12.9M/s
1.5MiB  13.1M/s
768KiB  13.3M/s
384KiB  13.1M/s
192KiB  13.4M/s
96KiB   12.2M/s
48KiB   11.8M/s

稳妥起见,用24K作为page size,12M为erase block size。

在分区的时候,起始地址使用12MB来对齐12M的erase block size。
sudo -c /fdisk/mmcblk0



>>sudo mkfs.ext4 -O ^has_journal -E stride=6,stripe-width=3072 -b 4096 -L Fedora14Arm  /dev/mmcblk0p1

-O ^has_journal 表示关闭日志记录功能,这个可以减轻写入,但是系统崩溃后文件恢复就很困难,这个可以自己选择是否需要。
-b 4096代表以4K来作为文件系统的簇,最小读写单位。
-L 是创建label
stride可以理解为读的最小单位,而strip-width可以理解为写的最小单位,这是以RAID磁盘阵列为基础的,但是flash的内部读写和RAID有一定的相似之处,所以这里也可以复用。读希望是page的大小,也就是24K,而写希望是12M,一个erase block size的大小,避免read-modif
y-write的操作,以4K为单位计算,stride=page size/cluster size=24K/4K=6,strip-width=12*1024K/4K=3072.

>> sudo mount -o data=writeback,noatime,nodiratime /dev/mmcblk0p1 /media/sd
noatime 不更新node的时间
nodiratime 不更新文件夹的修改时间

beaglebone black简单入门(二)

从(一)中的简单配置,beaglebone black已经可以开始工作了。以下的设置假设操作系统是debian其安装在emmc中,有sd卡作为普通存储介质,emmc只作为系统盘。有可用的网络。







1.输入passwd修改root用户密码,在使用passwd debian修改debian用户密码。

>>useradd beaglebone
该命令将创建beaglebone,处于同名的beaglebone group中。
用groups debian查看debian的用户组权限。
beaglebone@beaglebone:~$ groups debian
debian : debian adm kmem dialout cdrom floppy audio dip video plugdev users netdev i2c admin spi systemd-journal weston-launch xenomai

>>useradd -G adm,kmem,dialout,cdrom,floppy,audio,dip,video,plugdev,users, netdev, i2c,admin,spi ,systemd-journal,weston-launch,xenomai beaglebone


首先用update-alternatives --config editor来修改系统默认的编辑器,我习惯用vim。

su - (注意:su的后面有“-”符号,su - 和 su是不一样的)

2) sudo visudo
在root ALL=(ALL) ALL行的下面添加如下:
your-user-name ALL=(ALL) ALL
如果想在sudo的时候不需输入密码,则在root ALL=(ALL) ALL行的下面添加如下:
your-user-name ALL=(ALL) NOPASSWD: ALL


重启beaglebone,用beaglebone远程登录beaglebone black。

beaglebone black没有ntc模块,所以系统时间无法保存,系统时间不对的话,会引起各种各样的问题,比如更新时间不对,编译出错。


>>date -s mm/dd/yyyy来手工设定时间

>>sudo apt-get update
>>sudo apt-get install ntp ntpdate

安装后ntp service也许有各种问题无法自动更新时间,所以可以用
>>ntpdate -u -s -b pool.ntp.org来手动更新时间



>>dpkg-query -Wf '${Installed-Size}\t${Package}\n' | sort -n


Remove X11 package (GUI)
>>apt-get remove -y x11-common
>>apt-get autoremove

Remove Desktop environment GNOME and GTK
>>apt-get remove libgtk-3-common --purge
>>apt-get autoremove
>>apt-get remove libgtk2.0-common --purge
>>apt-get autoremove
>>rm -r /usr/share/icons
>>apt-get remove gnome-* --purge
>>apt-get autoremove
>>apt-get remove desktop-base --purge
>>apt-get autoremove

>>df -h




首先将router的访问端口由80改为8080,因为80端口是最不太可能被封锁的网络端口,所以当访问beaglebone black时,我希望用80端口访问。

假设beagleone black的局域网ip是192.168.1.7,在虚拟服务器,创建转发规则如下:


这样假设router的外网ip是212.168.1.1,在外部网络用ssh软件登陆212.168.1.1,端口80就可以访问beaglebone black了。ssh软件推荐mobaXterm,功能强大,且登陆后自动打开stfp功能,可以直接上传和下载文件。


自行google debian 中文化
>>vim ~/.profile
export LANG=zh_CN.UTF-8
export LC_ALL=zh_CN.UTF-8




>>sudo mkdir /media/sd
>>sudo mkdir /media/usbhdd

>>sudo chown beaglebone:beaglebone /media/sd
>>sudo chown beaglebone:beaglebone /media/usbhdd

>>chmod -R 755 /media/sd
>>chmod -R 755 /media/usbhdd

sudo mount /dev/mmcblk0p1 /media/sd
sudo mount -t ntfs -o uid=beaglebone,gid=beaglebone /dev/sda1 /media/usbhdd

其中uid和gid是vfat和ntfs分区的权限管理参数,设置为当前用户后,其实对所有用户都是可读写的,所以如果用beaglebone black当多用户服务器的话,一定要小心。


>>sudo chown beaglebone:beaglebone /media/sd
>>sudo umount /media/sd

>>sudo mount /dev/mmcblk0p1 /media/sd

再用ls -l /media/sd查看,发现用户组已经是beaglebone了,以后就可以正常的mount,不用担心无权限访问了。

mount /dev/mmcblk0p1 /media/sd
mount -t ntfs -o uid=beaglebone,gid=beaglebone /dev/sda1 /media/usbhdd


umount /media/sd
umount /media/usbhdd




可以用sudo apt-cache search python3*来选择




网络浏览用w3m和links2,其他不推荐,因为是用ssh隧道,不支持frame buffer,所以是无法浏览图片的。



tmux是很重要的工具,它最重要的一点就是可以运行命令与当前ssh进程隔离,这样如果ssh进程因为网络问题断开后,运行的程序不会因此而终止。请自行google tmux的使用。